AddWeb Solution: Continuous Delivery with Drupal - The Need of the Hour!

Planet Drupal - 21 December 2018 - 4:02am

Continuous Delivery - A trending word in the world of technology. Continuous Delivery(CD), along with Continuous Integration(CI), is becoming a popular term even with the non-technical people. And hence, every IT company is seeing a flood of clients coming with the demand of both of them. Both of these techniques - CI/CD are closely associated with the quality-oriented work methodologies - Agile and DevOps. And so are we!


Team AddWeb has been for years been associated with and following Agile and DevOps. Just as we’re associated with Drupal. No wonder, we have been ardently following continuous delivery with Drupal for years now. So let us first throw some light on this popular concept of ‘Continuous Delivery with Drupal’.


What is Continuous Delivery?

Continuous Delivery is a process of automatically deploying all your changes made on development stage, directly to the production stage. This kind of delivery is done by accepting all the unit-cases followed by coding standards. Once the code is merged with the stage branch from the development branch, the same stage branch also gets an automatic update with the help of Jenkins and git-webhook, which is triggered by merging the branches. A similar process of automatic delivery is also followed on the production site; where the code is merged with the master branch from stage branch, which is later deployed to the production servers.


Team AddWeb, as we mentioned previously, has been persistent followers of CD with Drupal via Jenkins, Ansible, and RocketChat. We believe, in today’s day and age, CD, and CI hold so much of significance because one can define repetitive tasks for one time and then on every build the same mentioned steps will run in order to update the new content. And when we speak of so much so of its importance, let us also share the tools, block diagram and process that we, at AddWeb, choose to follow for Continuous Delivery.

, ,

CD/CI Tools Used by Team AddWeb:

There are multiple tools that can we used to follow the process of Continuous Delivery. Let us share the ones that we, at AddWeb, have been using for years. You can consider this as a recommendation from us, for the amount of experience we hold in using them successfully for all these years.

  1. Git

  2. Docker

  3. Jenkins

  4. Ansible

  5. Rocket-chat

Block Diagram Used by Team AddWeb:

Just as a picture is worth a thousand words, a diagram for us - the techies, is worth a hundred written words. We at AddWeb, do understand and empathise with this fact and hence, here’s the block diagram that we personally use for Continuous Delivery.

, ,

Process Followed by Team AddWeb:

Every developer has their own process to be followed for Continuous Delivery. Here’s the one that team AddWeb choose to follow:

  1. As soon as the developer pushes code into the git repo, a webhook will be triggered. This will call Jenkins, which will further run the Ansible playbooks. These Ansible playbooks comprise of the code of delivery process, which is eventually followed by sending a push notification to the RocketChat server once the code is successfully built. One also receives this push notification in case of built failure condition.

  2. Pipeline code is written in Ansible playbook for a continuous delivery process:
    - Create a backup of code and database
    - Pull the latest code on the server by git pull
    - Run composer install for Drupal-8 sites to install new modules, libraries
    - Run drush updb -y
    - Run drush cim -y
    - Run drush cr
    - Send notification of successful build or failed build job details into Rocketchat

Hope the overall understanding and the provided guideline regarding the much-talked about and significant Continuous Delivery has proven helpful to you. In case, you have something to add on to the above information or even correct anything, feel free to contact us. Also, you can share what do you want us to share in our next blog. We’re all ears for suggestions and recommendations.

Categories: Drupal

Referer to Entity Reference

New Drupal Modules - 20 December 2018 - 5:20pm

Allows you to turns the HTTP referer field into the default value for an entity reference.

Enable support for a specific entity reference by editing the field, and ticking the "Set value from HTTP Referer field if it matches a valid url for an entity on your site" checkbox.

If the referer url is a valid entity reference, it will be used to populate the default value for the entity reference.

Categories: Drupal

Drupal core announcements: Seeking volunteers to fill new core team facilitation roles

Planet Drupal - 20 December 2018 - 1:13pm

Over the past five years, to meet the growing velocity in Drupal 8 core development and facilitate a more mature release process, we've gradually grown the Drupal 8 core committer team from two people to four, then six, then twelve people.

We've reached a team size where we'd benefit from additional team members whose primary focus is helping the committer team function more effectively, through facilitating process; communicating with other maintainers, initiative teams, and the community; and organizing meetings and discussions.

To this end, the core team is adding two additional roles to our governance (more details behind the link):

  • A committer team facilitator role, responsible for helping organize and run committer discussions. The committer team facilitator supports the committer team in the team's priorities (but does not set these priorities). This important project management assistance will allow the core committer team to spend more time reviewing and committing patches, which will increase the quality and speed of improvements.
  • A core initiative facilitator role, responsible for supporting core initiative teams across initiatives and helping initiative coordinators. This is a very important role because it helps initiative teams to deliver software that meets end user needs and brings better community awareness to the efforts going on within initiatives.

Both roles are estimated to be a 10-15 hour/month commitment, and we're suggesting a renewable one year term for each.

Adding project management backing to the team will help us be more effective, and to better focus on those roles and tasks that only committers can do. I'm excited about this direction, as it embodies our principle of everyone has something to contribute, valuing non-technical contributions at the same level as technical contributions by making these roles a formal part of the committer team.

If you're interested in one or both of these new roles, please get in touch!

Categories: Drupal

Drupal Atlanta Medium Publication: Your Holiday Gift from DrupalCamp Atlanta: Session Videos Now Live

Planet Drupal - 20 December 2018 - 10:00am
To Zach Sines and Taylor Wright, It’s not goodbye, it’s see you later.2018 DrupalCamp Atlanta Group Picture

Thanks to all of the presenters and participants who attended 2018 DrupalCamp Atlanta (DCATL). We are excited to provide you with a little holiday gift. The Session Videos are now live. View here

I would also like to thank the awesome DCATL team that I had the pleasure to work with:

  • Sarah Golden — Acquia
  • Nikki Smith — Sevaa
  • Zach Sines — Manhattan Associates
  • Taylor Wright
Overcoming Challenges

As with any event, this year’s DCATL had some interesting twists and turns that we were able to overcome. The biggest and most noticeable one, of course, was the construction that was happening at the hotel. Two weeks before the event, I met with the hotel event staff to discuss our setup. On my way into the hotel, everything looked as I expected and it was business as usual. When I entered the lobby I noticed they were putting up a temporary wall that blocks off the hotel bar. During our discussion I was informed there was going to be some construction going on during our camp but as ensured that the event space wouldn’t be impacted.

The DCATL team arrived at the hotel to load in and everyone was mortified when we saw the front of the building. No more than 10 minutes after we arrived, I received a message from one of the trainers asking, “are we still having the conference?” We immediately started thinking about how we can alleviate the situation so we took a picture of the building an sent an email out to everyone ensuring them that the interior of the building was okay and that we were still going to have an awesome conference.

It wasn’t all doom and gloom. 10 days before the camp, we were still short on the financials and were kind of sweating it out (although we had reserve funds to cover the costs) thinking of ways that we could reduce costs without getting rid of too much programming. I received a phone call from an employee at Turner, asking if they could be a Diamond Sponsor and would also like to sponsor the after party. WOW! I couldn’t believe we were getting bailed out in the last minute, phew!

2019 Goals for DrupalCamp Atlanta

After the camp, I got a chance to have lunch with a mentor of mine and we talked about where are the next generation of Drupalers going to come from and what purpose camps serve today vs ten years ago. So based on our discussion here are my top two goals I would like to propose to the DCATL organizing team.

Increase the Number of Case Studies with co-presentations from Drupal shops and their Clients.

Another topic we discussed was how Acquia Engage has taken a different approach by showcasing their clients and providing opportunities for Drupal shops to schedule a meet and greets talk with their clients. During the opening session at DCATL I asked the audience, “raise your hand if you have invited a client to attend or co-present at DrupalCamp Atlanta.” Out of all the attendees maybe 2 raised their hands.

Increase the Number of Student Attendees

When looking some of my Drupal colleague's user profiles so many of us over 10 years. This means we are getting old folks :) But more importantly, where are the net generation of Drupalers going to come from. The state of Georgia has 114 colleges and 326,609 students. I know it takes a lot of energy but we have to figure out a way to use our camp as a pipeline for nurturing the next generation of Drupalist.

It's Not Goodbye Its See You Soon

For the past 5.5 years, I have had the pleasure to work with Zach Sines and Taylor Wright as board members of the Atlanta Drupal Users Group (ADUG). Both Zach and Taylor were key stakeholders in the restructuring of the organization. Zach took on the writing of the bylaws that states how people are elected, what are the rules for participating, what are the roles and responsibilities of each officer and so on. Taylor has a ton of finance experience so he took on the responsibility of cleaning up our financials and paying all of our bills. These two have been by my side, even after heated discussions and have been what I like to call my nice translators. Sometimes I have the tendency to be too blunt so they were always had my back :).

Zach in the Green on the Left. Taylor in the Green on the Right

Earlier this year, both Zach and Taylor informed all of us that 2018 will be their last year serving on the board. Not to get too mushy but I am going to miss them both a lot, I mean a ton. Not just for their expertise but hearing their voices on our monthly calls and some of their hilarious stories. But what is great about Drupal is that you build some lasting relationships and now I consider these two my friends. Thank you for all the work you have put into running these events, and I know this is not goodbye its soo you soon.

ADUG is Looking for New Board Members

With our current vacancies, the Atlanta Drupal User Group (ADUG) is currently looking for new board members to join our team. While the serving on a board can sound intimidating we are really just a bunch of Drupalers who want to give back to the community. All of our meetings are held on a video call. If you are interested or know some who would be a great fit, please feel free to contact us.

Your Holiday Gift from DrupalCamp Atlanta: Session Videos Now Live was originally published in Drupal Atlanta on Medium, where people are continuing the conversation by highlighting and responding to this story.

Categories: Drupal

DrupalEasy: DrupalEasy Podcast 214 -Travis Carden - Drupal Spec Tool

Planet Drupal - 20 December 2018 - 6:55am

Direct .mp3 file download.

Travis Carden, (traviscarden), Senior Software Engineer at Acquia joins Mike Anello to talk about the spreadsheet-based Drupal Spec Tool, a very cool tool that allows teams to specify different parts of a Drupal site and then generates diagrams and Behat tests.

Interview DrupalEasy News Upcoming events Sponsors
  • Drupal Aid - Drupal support and maintenance services. Get unlimited support, monthly maintenance, and unlimited small jobs starting at $99/mo.
  • WebEnabled.com - devPanel.
Follow us on Twitter Subscribe

Subscribe to our podcast on iTunes, Google Play or Miro. Listen to our podcast on Stitcher.

If you'd like to leave us a voicemail, call 321-396-2340. Please keep in mind that we might play your voicemail during one of our future podcasts. Feel free to call in with suggestions, rants, questions, or corrections. If you'd rather just send us an email, please use our contact page.

Categories: Drupal

Hide login

New Drupal Modules - 20 December 2018 - 6:05am


Simple and lightweight module to move (aka hide) the user login form.

Categories: Drupal

DrupalCon News: Community Connection - Tara King

Planet Drupal - 20 December 2018 - 4:49am

We’re featuring some of the people in the Drupalverse! This Q&A series highlights individuals you could meet at DrupalCon.

Every year, DrupalCon is the largest gathering of people who belong to this community. To celebrate and take note of what DrupalCon means to them, we’re featuring an array of perspectives and fun facts to help you get to know your community.

Categories: Drupal

Agiledrop.com Blog: Top Drupal blog posts from November 2018

Planet Drupal - 20 December 2018 - 3:21am

To continue with our tradition of compiling the top blog posts involving Drupal from the previous month, we’ve prepared a list of blog posts from November 2018 that stuck with us the most.

Categories: Drupal

Droptica: It’s been a year with Droopler!

Planet Drupal - 20 December 2018 - 1:00am
Soon, we will be celebrating the first anniversary of the day Droopler – our open Drupal 8 distribution – was released. It is a perfect time for some summaries and plans for the future. In this article, I’m going to show you how Droopler works and what awaits its users in the upcoming release.
Categories: Drupal

Bulk Entity Action

New Drupal Modules - 19 December 2018 - 10:11pm

Under Development.

Categories: Drupal

Token Debug

New Drupal Modules - 19 December 2018 - 9:01pm

Adds an admin page that helps debug tokens.

Categories: Drupal

Views CSS Style

New Drupal Modules - 19 December 2018 - 8:37pm

Provides a views style plugin that renders inline css to head area.

Categories: Drupal

Jeff Geerling's Blog: Deploying an Acquia BLT Drupal 8 site to Kubernetes

Planet Drupal - 19 December 2018 - 2:28pm

Wait... what? If you're reading the title of this post, and are familiar with Acquia BLT, you might be wondering:

  • Why are you using Acquia BLT with a project that's not running in Acquia Cloud?
  • You can deploy a project built with Acquia BLT to Kubernetes?
  • Don't you, like, have to use Docker instead of Drupal VM? And aren't you [Jeff Geerling] the maintainer of Drupal VM?

Well, the answers are pretty simple:

Categories: Drupal


New Drupal Modules - 19 December 2018 - 2:14pm

Provides Drupal integration with the jsSHA library, a JavaScript implementation of the complete Secure Hash Standard family as well as HMAC.

Categories: Drupal

Aten Design Group: GraphQL with Drupal: Getting Started

Planet Drupal - 19 December 2018 - 10:29am

Decoupling Drupal is a popular topic these days. We’ve recently posted about connecting Drupal with Gatsby, a subject that continues to circulate around the Aten office. There are a number of great reasons to treat your CMS as an API. You can leverage the content modeling powers of Drupal and pull that content into your static site, your javascript application, or even a mobile app. But how to get started?

In this post I will first go over some basics about GraphQL and how it compares to REST. Next, I will explain how to install the GraphQL module on your Drupal site and how to use the GraphiQL explorer to begin writing queries. Feel free to skip the intro if you just need to know how to install the module and get started.

A Brief Introduction to GraphQL

Drupal is deep in development on an API First Initiative, and the core team is working on getting json:api into core. This exposes Drupal's content via a consistent, standardized solution which has many advantages and responds to REST requests.

Recently the JavaScript community has become enamored with GraphQL, a language for querying databases which is touted as an alternative to REST for communicating with an API.

Developed by Facebook, GraphQL is now used across the web from the latest API of Github to the New York Times redesign.

GraphQL opens up APIs in a way that traditional REST endpoints cannot. Rather than exposing individual resources with fixed data structures and links between resources, GraphQL gives developers a way to request any selection of data they need. Multiple resources on the server side can be queried at once on the client side, combining different pieces of data into one query and making the job of the front-end developer easier.

Why is GraphQL Good for Drupal?

GraphQL is an excellent fit for Drupal sites, which are made up of entities that have data stored as fields. Some of these fields could store relationships to other entities. For example, an article could have an author field which links to a user.

The Limitations of REST

Using a REST API with that example, you might query for “Articles”. This returns a list of article content including an author user id. But to get that author’s content you might need to do a follow-up query per user ID to get that author’s info, then stitch together that article with the parts of the author you care about. You may have only wanted the article title, link and the author name and email. But if the API is not well designed this could require several calls to the server which returned way more info that you wanted. Perhaps including the article publish date, it’s uuid, maybe the full content text as well. This problem of “overfetching” and “underfetching” is not an endemic fault with all REST based APIs. It’s worth mentioning that json:api has its own solutions for this specific example, using sparse fieldsets and includes.

Streamlining with GraphQL

With GraphQL, your query can request just the fields needed from the Article. Because of this flexibility, you craft the query as you want it, listing exactly the fields you need (Example: the title and URL, then it traverses the relationship to the user, grabbing the name and email address). It also makes it simple to restructure the object you want back; starting with the author then getting a reverse reference to Articles. Just by rewriting the query you can change the display from an article teaser to a user with a list of their articles.

Either of these queries can be written, fields may be added or removed from the result, and all of this without writing any code on the backend or any custom controllers.

This is all made possible by the GraphQL module, which exposes every entity in Drupal from pages to users to custom data defined in modules, as a GraphQL schema.

Installing GraphQL for Drupal

If you want to get started with GraphQL and Drupal, the process requires little configuration.

  1. Install the module with Composer, since it depends on a vendor library GraphQL-php If you're using a Composer based Drupal install use the command:
    composer require drupal/graphql
    to install the module and its dependencies.
  2. Enable the module; it will generate a GraphQL schema for your site which you can immediately explore.
Example Queries with GraphiQL

Now that you have GraphQL installed, what can you do? How do you begin to write queries to explore your site’s content? One of the most compelling tools built around GraphQL is the explorer, called GraphiQL. This is included in the installation of the Drupal GraphQL module. Visit it at:


The page is divided into left and right sides. At the left you can write queries. Running a query with the button in the top left will display the response on the right pane.

Write a basic query on the left side, hit the play button to see the results on the right.

As you write a query, GraphiQL will try to autocomplete to help you along.

As you type, GraphiQL will try to autocomplete With entities, you can hit play to have it fill in all the default properites.

You can also dive into the live documentation in the far right pane. You'll see queries for your content types, the syntax for selecting fields as well as options for filtering or sorting.

Since the schema is self documenting, you can explore the options available in your site.

The documentation here uses autocomplete as well. You can type the name of an entity or content type to see what options are available.

Add additional filter conditions to your query.

Filters are condition groups, in the above example I am filtering by the "article" content type.

In the previous example I am just getting generic properties of all nodes, like entityLabel. However, if I am filtering by the "Article" type, I would want access to fields specific to Articles. By defining those fields in a "fragment", I can substitute the fragment right into my query in place of those individual defaults.

Use fragments to set bundle specific fields.

Because my author field is an entity reference, you'll see the syntax is similar to the nodes above. Start with entities, then list the fields on that entity you want to display. This would be an opportunity to use another fragment.

Now that the query is displaying results how I want, I can add another filter to show different content. In this case; a list of unpublished content.

Add another filter to see different results.

Instead of showing a list of articles with their user, I could rearrange this query to get all the articles for a given user.

Display reverse references with the same fragment.

I can reuse the same fragment to get the Article exactly as I had before, or edit that fragment to remove just the user info. The nodeQuery just changes to a userById which takes an id similar to how the nodeQuery can take a filter. Notice the reverseFieldAuthorNode. This allows us to get any content that references the user.

Up Next: Building a Simple GraphQL App

If you’re new to GraphQL, spend a little time learning how the query language works by practicing in the GraphiQL Explorer. In the next part of this post I will go over some more query examples, write a simple app with create-react-app and apollo, and explain how GraphQL can create and update content by writing a mutation plugin.

Categories: Drupal

Lullabot: A Toolset For Enterprise Content Inventories

Planet Drupal - 19 December 2018 - 10:06am

Earlier this year, Lullabot began a four-month-long content strategy engagement for the state of Georgia. The project would involve coming up with a migration plan from Drupal 7 to Drupal 8 for 85 of their state agency sites, with an eye towards a future where content can be more freely and accurately shared between sites. Our first step was to get a handle of all the content on their existing sites. How much content were we dealing with? How is it organized? What does it contain? In other words, we needed a content inventory. Each of these 85 sites was its own individual install of Drupal, with the largest containing almost 10K unique URLs, so this one was going to be a doozy. We hadn't really done a content strategy project of this scale before, and our existing toolset wasn't going to cut it, so I started doing some research to see what other tools might work. 

Open up any number of content strategy blogs and you will find an endless supply of articles explaining why content inventories are important, and templates for storing said content inventories. What you will find a distinct lack of is the how: how does the data get from your website to the spreadsheet for review? For smaller sites, manually compiling this data is reasonably straightforward, but once you get past a couple hundred pages, this is no longer realistic. In past Drupal projects, we have been able to use a dump of the routing table as a great starting point, but with 85 sites even this would be unmanageable. We quickly realized we were probably looking at a spider of some sort. What we needed was something that met the following criteria:

  • Flexible: We needed the ability to scan multiple domains into a single collection of URLs, as well as the ability to include and exclude URLs that met specific criteria. Additionally, we knew that there would be times when we might want to just grab a specific subset of information, be it by domain, site section, etc. We honestly weren't completely sure what all might come in handy, so we wanted some assurance that we would be able to flexibly get what we needed as the project moved forward.
  • Scalable: We are looking at hundreds of thousands of URLs across almost a hundred domains, and we knew we were almost certainly going to have to run it multiple times. A platform that charged per-URL was not going to cut it.
  • Repeatable: We knew this was going to be a learning process, and, as such, we were going to need to be able to run a scan, check it, and iterate. Any configuration should be saveable and cloneable, ideally in a format suitable for version control which would allow us to track our changes over time and more easily determine which changes influenced the scan in what ways. In a truly ideal scenario, it would be scriptable and able to be run from the command line.
  • Analysis: We wanted to be able to run a bulk analysis on the site’s content to find things like reading level, sentiment, and reading time. 

Some of the first tools I found were hosted solutions like Content Analysis Tool and DynoMapper. The problem is that these tools charge on a per-URL basis, and weren't going to have the level of repeatability and customization we needed. (This is not to say that these aren't fine tools, they just weren't what we were looking for in terms of this project.) We then began to architect our own tool, but we really didn't want to add the baggage of writing it onto an already hectic schedule. Thankfully, we were able to avoid that, and in the process discovered an incredibly rich set of tools for creating content inventories which have very quickly become an absolutely essential part of our toolkit. They are:

  • Screaming Frog SEO Spider: An incredibly flexible spidering application. 
  • URL Profiler: A content analysis tool which integrates well with the CSVs generated by Screaming Frog.
  • GoCSV: A robust command line tool created with the sole purpose of manipulating very large CSVs very quickly.

Let's look at each of these elements in greater detail, and see how they ended up fitting into the project.

Screaming Frog undefined

Screaming Frog is an SEO consulting company based in the UK. They also produce the Screaming Frog SEO Spider, an application which is available for both Mac and Windows. The SEO Spider has all the flexibility and configurability you would expect from such an application. You can very carefully control what you do and don’t crawl, and there are a number of ways to report the results of your crawl and export it to CSVs for further processing. I don’t intend to cover the product in depth. Instead, I’d like to focus on the elements which made it particularly useful for us.


A key feature in Screaming Frog is the ability to save both the results of a session and its configuration for future use. The results are important to save because Screaming Frog generates a lot of data, and you don’t necessarily know which slice of it you will need at any given time. Having the ability to reload the results and analyze them further is a huge benefit. Saving the configuration is key because it means that you can re-run the spider with the exact same configuration you used before, meaning your new results will be comparable to your last ones. 

Additionally, the newest version of the software allows you to run scans using a specific configuration from the command-line, opening up a wealth of possibilities for scripted and scheduled scans. This is a game-changer for situations like ours, where we might want to run a scan repeatedly across a number of specific properties, or set our clients up with the ability to automatically get a new scan every month or quarter.

Extraction undefined

As we explored what we wanted to get out of these scans, we realized that it would be really nice to be able to identify some Drupal-specific information (NID, content type) along with the more generic data you would normally get out of a spider. Originally, we had thought we would have to link the results of the scan back to Drupal’s menu table in order to extract that information. However, Screaming Frog offers the ability to extract information out of the HTML in a page based on XPath queries. Most standard Drupal themes include information about the node inside the CSS classes they create. For instance, here is a fairly standard Drupal body tag.

<body class="html not-front not-logged-in no-sidebars page-node page-node- page-node-68 node-type-basic-page">

As you can see, this class contains both the node’s ID and its content type, which means we were able to extract this data and include it in the results of our scan. The more we used this functionality, the more uses we found for it. For instance, it is often useful to be able to identify pages with problematic HTML early on in a project so you can get a handle on problems that are going to come up during migration. We were able to do things like count the number of times a tag was used within the content area, allowing us to identify pages with inline CSS or JavaScript which would have to be dealt with later.

We’ve only begun to scratch the surface of what we can do with this XPath extraction capability, and future projects will certainly see us dive into it more deeply. 

Analytics undefined

Another set of data you can bring into your scan is associated with information from Google Analytics. Once you authenticate through Screaming Frog, it will allow you to choose what properties and views you wish to retrieve, as well as what individual metrics to report within your result set. There is an enormous number of metrics available, from basics like PageViews and BounceRate to extended reporting on conversions, transactions, and ad clicks. Bringing this analytics information to bear during a content audit is the key to identifying which content is performing and why. Screaming Frog also has the ability to integrate with Google Search Console and SEO tools like Majestic, Ahrefs, and Moz.


Finally, Screaming Frog provides a straightforward yearly license fee with no upcharges based on the number of URLs scanned. This is not to say it is cheap, the cost is around $200 a year, but having it be predictable without worrying about how much we used it was key to making this part of the project work. 

URL Profiler undefined

The second piece of this puzzle is URL Profiler. Screaming Frog scans your sites and catalogs their URLs and metadata. URL Profiler analyzes the content which lives at these URLs and provides you with extended information about them. This is as simple as importing a CSV of URLs, choosing your options, and clicking Run. Once the run is done, you get back a spreadsheet which combines your original CSV with the data URL Profiler has put together. As you can see, it provides an extensive number of integrations, many of them SEO-focused. Many of these require extended subscriptions to be useful, however, the software itself provides a set of content quality metrics by checking the Readability box. These include

  • Reading Time
  • 10 most frequently used words on the page
  • Sentiment analysis (positive, negative, or neutral)
  • Dale-Chall reading ease score
  • Flesh-Kincaid reading ease score
  • Gunning-Fog estimation of years of education needed to understand the text
  • SMOG Index estimation of years of education needed to understand the text

While these algorithms need to be taken with a grain of salt, they provide very useful guidelines for the readability of your content, and in aggregate can be really useful as a broad overview of how you should improve. For instance, we were able to take this content and create graphs that ranked state agencies from least to most complex text, as well as by average read time. We could then take read time and compare it to "Time on Page" from Google Analytics to show whether or not people were actually reading those long pages. 

On the downside, URL Profiler isn't scriptable from the command-line the way Screaming Frog is. It is also more expensive, requiring a monthly subscription of around $40 a month rather than a single yearly fee. Nevertheless, it is an extremely useful tool which has earned a permanent place in our toolbox. 


One of the first things we noticed when we ran Screaming Frog on the Georgia state agency sites was that they had a lot of PDFs. In fact, they had more PDFs than they had HTML pages. We really needed an easy way to strip those rows out of the CSVs before we ran them through URL Profiler because URL Profiler won’t analyze downloadable files like PDFs or Word documents. We also had other things we wanted to be able to do. For instance, we saw some utility in being able to split the scan out into separate CSVs by content type, or state agency, or response code, or who knows what else! Once again I started architecting a tool to generate these sets of data, and once again it turned out I didn't have to.

GoCSV is an open source command-line tool that was created with the sole purpose of performantly manipulating large CSVs. The documentation goes into these options in great detail, but one of the most useful functions we found was a filter that allows you to generate a new subset of data based on the values in one of the CSV’s cells. This allowed us to create extensive shell scripts to generate a wide variety of data sets from the single monolithic scan of all the state agencies in a repeatable and predictable way. Every time we did a new scan of all the sites, we could, with just a few keystrokes, generate a whole new set of CSVs which broke this data into subsets that were just documents and just HTML, and then for each of those subsets, break them down further by domain, content type, response code, and pre-defined verticals. This script would run in under 60 seconds, despite the fact that the complete CSV had over 150,000 rows. 

Another use case we found for GoCSV was to create pre-formatted spreadsheets for content audits. These large-scale inventories are useful, but when it comes to digging in and doing a content audit, there’s just way more information than is needed. There were also a variety of columns that we wanted to add for things like workflow tracking and keep/kill/combine decisions which weren't present in the original CSVs. Once again, we were able to create a shell script which allowed us to take the CSVs by domain and generate new versions that contained only the information we needed and added the new columns we wanted. 

What It Got Us

Having put this toolset together, we were able to get some really valuable insights into the content we were dealing with. For instance, by having an easy way to separate the downloadable documents from HTML pages, and then even further break those results down by agency, we were able to produce a chart which showed the agencies that relied particularly heavily on PDFs. This is really useful information to have as Georgia’s Digital Services team guides these agencies through their content audits. 


One of the things that URL Profiler brought into play was the number of words on every page in a site. Here again, we were able to take this information, cut out the downloadable documents, and take an average across just the HTML pages for each domain. This showed us which agencies tended to cram more content into single pages rather than spreading it around into more focused ones. This is also useful information to have on hand during a content audit because it indicates that you may want to prioritize figuring out how to split up content for these specific agencies.


Finally, after running our scans, I noticed that for some agencies, the amount of published content they had in Drupal was much higher than what our scan had found. We were able to put together the two sets of data and figure out that some agencies had been simply removing links to old content like events or job postings, but never archiving it or removing it. These stranded nodes were still available to the public and indexed by Google, but contained woefully outdated information. Without spidering the site, we may not have found this problem until much later in the process. 

Looking Forward

Using Screaming Frog, URL Profiler, and GoCSV in combination, we were able to put together a pipeline for generating large-scale content inventories that was repeatable and predictable. This was a huge boon not just for the State of Georgia and other clients, but also for Lullabot itself as we embark on our own website re-design and content strategy. Amazingly enough, we just scratched the surface in our usage of these products and this article just scratches the surface of what we learned and implemented. Stay tuned for more articles that will dive more deeply into different aspects of what we learned, and highlight more tips and tricks that make generating inventories easier and much more useful. 

Categories: Drupal

Bulk mail

New Drupal Modules - 19 December 2018 - 9:52am

Bulk Email module provides site administrators an interface to send mass email in easy and quick way.
In order to use the modules like "Views bulk operation" or "views send" module, email addresses should be entities in drupal system but not the such case with this module.
Use case - Lets suppose, you have long list of emails in text file and you want to send email to all email addresses. In that case this module will be helpful.
How to use this module

Categories: Drupal

Migrate Process URL

New Drupal Modules - 19 December 2018 - 9:52am

A custom migration may only have a URL to import into a field. If using Drupal core's link field, you have to assign the value directly to the uri column:

field_my_website/uri: source: my_custom_url_source

The problem is, sometimes the source field will not be in the correct URL format for Drupal core. This module provides a new process plugin, field_link_generate the creates an array for use with the field_link process plugin:

Categories: Drupal

Jeff Geerling's Blog: Hosted Apache Solr now supports Drupal Search API 8.x-2.x, Solr 7.x

Planet Drupal - 19 December 2018 - 8:50am

Earlier this year, I completely revamped Hosted Apache Solr's architecture, making it more resilient, more scalable, and better able to support having different Solr versions and configurations per customer.

Today I'm happy to officially announce support for Solr 7.x (in addition to 4.x). This means that no matter what version of Drupal you're on (6, 7, or 8), and no matter what Solr module/version you use (Apache Solr Search or Search API Solr 1.x or 2.x branches), Hosted Apache Solr is optimized for your Drupal search!

Categories: Drupal

CKEditor Resize

New Drupal Modules - 19 December 2018 - 8:30am

This module integrates the resize CKEditor plugin for Drupal 8.

This plugin allows you to resize the classic editor instance by dragging the resize handle (◢)
located in the bottom right (or bottom left in the Right-to-Left mode) corner of the editor.
It can be configured to make the editor resizable only in one direction (horizontally, vertically)
or in both.

Categories: Drupal


Subscribe to As If Productions aggregator - Drupal