Data Engineering with Alteryx

I was really fortunate to be one of the few to receive an early preview of this awesome book before it was released in July of this year. What I love about this book is that it pulls together a lot of different techniques around the topic of data engineering. It also takes those techniques from a beginner's viewpoint right through to the advanced concepts, with great examples along the way. It teaches Alteryx users DataOps and it also helps engineers learn Alteryx, so whoever you are, you can really gain a lot from this book.

Why is this book needed in the marketplace and what has Paul brought to the table?

Alteryx users tend to be data-driven people who end up becoming Data Engineers alongside their day job. Alternatively, Alteryx grows within the business to the point that Data Engineering departments are born to enable the growth of data within the organisation. Paul has clearly seen both scenarios take place over his years of using and sharing Alteryx, and has really thought about all the techniques which could help individuals and teams deliver great results within the platform.

It’s not just one book, it’s THREE in ONE!

Data Engineering with Alteryx has been split into three parts: Introduction, Functional Steps in DataOps, and Governance of DataOps.

Of course, Paul introduces the reader to Alteryx Designer, Server and Connect, but what is great is that he also introduces the InDB tools within the first few pages. This is an area of Alteryx which isn’t utilised enough! The more Alteryx users understand what these tools bring to the table, the better. They let you use the power of your own databases before the data even hits Alteryx. Your database servers are often powerhouses of processing power and memory, and the InDB tools put this power in your hands.

He then goes on to define a data engineer. At the time I was leading a data engineering team within Avon, and boy, he does nail that definition.
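To make the in-database idea concrete, here is a minimal Python sketch of the general principle, not of how the InDB tools are built. It assumes an in-memory SQLite database as a stand-in for a real database server, and the `sales` table and its values are made up for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("North", 50.0), ("South", 75.0)],  # made-up rows
)

# Approach 1: pull every row out of the database and aggregate on the client.
totals_local = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    totals_local[region] = totals_local.get(region, 0.0) + amount

# Approach 2: let the database do the aggregation and return only the summary.
totals_in_db = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
```

Both approaches give the same totals, but the second moves only one row per region out of the database, which is the kind of saving in-database processing is after.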

“As a data engineer, you are an enabler.”

Paul Houghton

We were enablers, we drove change, made people’s lives easier and enabled the business to do things they were not able to do without the help of Alteryx and a proactive team of engineers. This simple sentence made me smile.

What I really enjoyed about the introduction section is that Paul really took the time to explain DataOps, and putting the People pillar first is absolutely key. I tend to think of the more traditional approach of People, Process and Technology, which is a very systematic view, with Agile methodology as the way of delivering processes and technology. Paul really focuses on running a data engineering team and the elements needed there around People, Delivery and Confidence. It was a great read.

Now for the practical part. The best part of Alteryx is that it’s practical, hands-on learning.

Part two of the book covers functional steps of building data pipelines and delivering them to databases as well as advanced analytic processes such as spatial analytics and machine learning models.

With graphical images and clear descriptions, Paul guides the user through how to get the most from their data connections in the latest version of Alteryx, and even explains what an API is and how to use one for data mining.

The simple stuff: most users don’t have a computer science background, so it’s great that Paul describes concepts like UNC paths and relative paths and why we use them. It shows what a conscientious writer he is. Great job Paul. 😉
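For anyone who hasn’t met the two kinds of path before, here is a small Python sketch of the difference. The server, share and file names are hypothetical, and `describe` is a helper written just for this illustration. It uses `PureWindowsPath`, so it runs on any operating system without touching the file system.

```python
from pathlib import PureWindowsPath

# Hypothetical paths, for illustration only.
unc = PureWindowsPath(r"\\fileserver\analytics\input\sales.csv")
relative = PureWindowsPath(r"input\sales.csv")


def describe(path):
    """Classify a Windows path as 'unc', 'absolute' (drive letter) or 'relative'."""
    if path.drive.startswith("\\\\"):
        # A UNC path names a server and share instead of a drive letter.
        return "unc"
    if path.is_absolute():
        return "absolute"
    return "relative"


# A relative path only means something once it is joined to a starting folder,
# here a hypothetical folder where the workflow is saved.
workflow_dir = PureWindowsPath(r"\\fileserver\analytics")
resolved = workflow_dir / relative
```

A UNC path points at the same network location from any machine, while a relative path follows the workflow wherever it is saved, which is why both turn up when workflows move between a desktop and a server.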

Throughout this whole section there are great examples, guides, graphics, key terms, code snippets, etc. that would guide both the beginner and the expert through key features that are useful in data engineering.

Now for the final section of the book, part three, Governance of DataOps, which is where the real gems are.

It takes most users a while to discover some of the tools and techniques mentioned here, so it’s great that they are the first things Paul brings to light. He also pays great respect to the CREW macros, developed by Adam Riley of Alteryx but loved and looked after by Mark Frisch (MarqueeCReW), Alteryx ACE, community leader and all-round good egg.

There is a huge amount of detail on setting up the Alteryx Usage Report, accessing the data within MongoDB (the database behind Alteryx Server), and using Git and GitHub Actions for development (continuous integration), all of which are very topical within organisations right now.

This section enables Alteryx users to take their development into production. It also helps you understand how users are utilising the server or apps, so you can redevelop items, stop maintaining those that are no longer required, and review errors to guide redevelopment. For those new to Alteryx Server, there is a whole chapter explaining governance and how to set it up, which will make getting started a breeze. This book opens up all the fundamentals users need to get the most out of Alteryx within the field of Data Engineering. I particularly loved the section on Git and validating XML with Python. (But that is the developer in me 😀)
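To give a flavour of what validating XML with Python can look like, here is a minimal sketch of my own, not the code from the book. It assumes the workflow file is XML with a root element called `AlteryxDocument`; `check_workflow_xml` and the sample strings are hypothetical names and data for illustration.

```python
import xml.etree.ElementTree as ET


def check_workflow_xml(text, expected_root="AlteryxDocument"):
    """Return (ok, message) for an XML string.

    Assumption: a valid workflow is well-formed XML whose root element
    is named by expected_root.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        # Mismatched or unclosed tags end up here.
        return False, f"not well-formed: {exc}"
    if root.tag != expected_root:
        return False, f"unexpected root element: {root.tag}"
    return True, "ok"


# Made-up sample documents, for illustration only.
good = '<AlteryxDocument yxmdVer="2021.4"><Nodes/></AlteryxDocument>'
bad = "<AlteryxDocument><Nodes></AlteryxDocument>"  # <Nodes> is never closed
```

A check like this could run as part of a continuous integration step, so a broken workflow file is caught before it is merged.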


Overall, this is a book I would buy and have on my shelf to share with those I am introducing to the world of Alteryx, as a kick-starter guide to developing data in the Alteryx platform. It is a perfect book for those starting out, and for those who may only have had exposure to Alteryx Designer before Alteryx Server and want to become more confident across the whole platform.

You need a copy of this book if you are:

  • Leading an Alteryx team – for development of your team
  • Alteryx Champion or a budding Alteryx Champion – for development of your platform knowledge
  • Brand new Alteryx user – for development of yourself

If you like giving geeky gifts, this will be gratefully received by all who receive it. You have my permission. 😉

