How To Automate PDF Data Extraction – 3 Different Methods To Parse PDFs For Analytics


If you work in data, then at some point in your career, you’ll likely need to parse data from a PDF. You might need to parse thousands of PDFs in order to pull out invoice information. Or maybe you need to parse financial filing documents such as 10-Ks. This can seem challenging at first. After all,…

The post How To Automate PDF Data Extraction – 3 Different Methods To Parse PDFs For Analytics appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

How To Modernize Your Data Strategy And Infrastructure For 2025


We are still in the early days of data and the value it can add to companies. You’ll read plenty of statistics about how much value data can drive and how far behind companies that aren’t using data are. And as a data consultant, I have helped companies find that value in their data. It…

The post How To Modernize Your Data Strategy And Infrastructure For 2025 appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Real-time Analytics Vs Stream Processing – What Is The Difference?


One of the holy grails that many data teams seem to chase is real-time data analytics. After all, if you can have real-time analytics, you can make better decisions faster. However, there often is a conflation between real-time data analytics and stream processing. These are two different concepts that are crucial to understanding how to…

The post Real-time Analytics Vs Stream Processing – What Is The Difference? appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Essential Skills for Data Engineers in the Age of AI


If you work in data, then AI is everywhere at this point. But whether AI is hype or reality doesn’t change the fact that data engineers will play a major role in ensuring that the data sets that are utilized for the growing use cases are usable both by machines and humans. Whether that data…

The post Essential Skills for Data Engineers in the Age of AI appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

How To Run A Data Team As A New Head Of Data


What would you do if you became the head or director of data for a 1,000-person company? Yesterday, you were plugging along as an analyst, and now, suddenly, you have all these new responsibilities. Figuring out where to start is part of the job. You’d probably feel a strong temptation to freak out. Who wouldn’t?…

The post How To Run A Data Team As A New Head Of Data appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

9 Habits Of Effective Data Managers – Running A Data Team


Running a successful data team is hard. Data teams are expected to juggle a combination of ad-hoc requests, big bet projects, migrations, and more, all while keeping up with the latest changes in technology. In the past few years, I have gotten to work with dozens of teams and see how various directors and managers deal…

The post 9 Habits Of Effective Data Managers – Running A Data Team appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

How To Data Model – Real Life Examples Of How Companies Model Their Data


How companies model their data varies widely. They might say they use Kimball dimensional modeling. However, when you look in their data warehouse, the only parts you recognize are the words fact and dim. Over nearly the past decade, I have worked for and with different companies that have used various methods to capture this data.…

The post How To Data Model – Real Life Examples Of How Companies Model Their Data appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Why Data Analysts And Engineers Make Great Consultants


Many data engineers and analysts don’t realize how valuable the knowledge they have is. They’ve spent hours upon hours learning SQL, Python, how to properly analyze data, build data warehouses, and understand the differences between eight different ETL solutions. Even what they might think is basic knowledge could be worth $10,000 to $100,000+ for a…

The post Why Data Analysts And Engineers Make Great Consultants appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

4 ELT Alternatives To Airbyte – How To Ingest Your Data


Getting data out of source systems and into a data warehouse or data lake is one of the first steps in making it usable by analysts and data scientists. The question is how will your team do that? Will they write custom data connectors, pay for a data connector out of the box or perhaps…

The post 4 ELT Alternatives To Airbyte – How To Ingest Your Data appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Terms You Should Know If You’re Planning To Use Change Data Capture


If you’ve worked in data long enough, then you’ve likely come across the term change data capture. Often called CDC, change data capture involves tracking and recording changes in a database as they happen, and then transmitting these changes to designated targets. This can be crucial because some pipelines, in particular batch pipelines, don’t capture…

The post Terms You Should Know If You’re Planning To Use Change Data Capture appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Apache Spark Vs Apache Flink – What Is The Difference?


As data increased in volume, velocity, and variety, so, in turn, did the need for tools that could help process and manage those larger data sets coming at us at ever faster speeds. As a result, frameworks such as Apache Spark and Apache Flink became popular due to their abilities to handle big data processing…

The post Apache Spark Vs Apache Flink – What Is The Difference? appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

10 Great Videos To Help You Learn Data Engineering


How data is structured, managed, and processed will continue to grow in importance as the demand for AI and machine learning increases. It’s unavoidable that as businesses demand that their data teams implement AI, they will also realize that data engineers are a crucial piece of the data pipeline. That means, if you’re looking for…

The post 10 Great Videos To Help You Learn Data Engineering appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Common Pitfalls of Data Analytics Projects


Have you ever been part of a data or software project that seems stuck in a loop? Three weeks have passed, and although you arrive at work daily, exhausted, having tackled numerous issues, the project remains stagnant. Why? Then, suddenly, a new engineer or project manager steps in, reorganizes and prioritizes tasks, and just like…

The post Common Pitfalls of Data Analytics Projects appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Apache Druid’s Architecture – How Druid Processes Data In Real Time At Scale


Recently, I wrote an article diving into what Druid is and which companies are using it. Now I want to do a deeper dive into Apache Druid’s architecture. Apache Druid has several unique features that allow it to be used as a real-time OLAP. Everything from its various nodes and processes that each have unique…

The post Apache Druid’s Architecture – How Druid Processes Data In Real Time At Scale appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

5 Real-Time Data Processing and Analytics Technologies – And Where You Can Implement Them


No matter your industry, you’ll often need to make split-second business decisions in the digital age. Real-time data can help you do just that. It’s information that’s made available as soon as it’s created, meaning you don’t need to wait around for the insights you need. Real-time data processing can satisfy the ever-increasing demand for…

The post 5 Real-Time Data Processing and Analytics Technologies – And Where You Can Implement Them appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Alternatives to SSIS(SQL Server Integration Services) – How To Migrate Away From SSIS


SQL Server Integration Services (SSIS) comes with a lot of functionality useful for extracting, transforming, and loading data. It can also play important roles in application development and other projects. But SSIS is far from the only platform that can provide these services. You might seek alternatives to SSIS because you want a more agile…

The post Alternatives to SSIS(SQL Server Integration Services) – How To Migrate Away From SSIS appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Why Your Team Needs To Implement Data Quality For Your AI Strategy


Companies that range from start-ups to enterprises are looking to implement AI and ML into their data strategy. With that, it’s important not to forget about data quality. Regardless of how fancy or sophisticated a company’s AI model might be, poor data quality will break it. It will make the outputs of these models useless…

The post Why Your Team Needs To Implement Data Quality For Your AI Strategy appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Data Warehousing Essentials: A Guide To Data Warehousing


Data warehouses and data lakes play a crucial role for many businesses. They give businesses access to the data from all of their various systems, and they often integrate that data so that end-users can answer business-critical questions. But if we take a step back and only focus on the…

The post Data Warehousing Essentials: A Guide To Data Warehousing appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Cutting Your Data Stack Costs: How To Approach It And Common Issues


I once had an engineer tell me that they essentially didn’t want to consider cost as they were building a solution. I was baffled. Don’t get me wrong, yes, when you’re building, you iterate and aim to improve your solution’s cost. But from my perspective, I don’t think completely ignoring costs from day one is…

The post Cutting Your Data Stack Costs: How To Approach It And Common Issues appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

7 Great Embedded Analytics Solutions – Which Embedded Analytics Solutions Should You Use?


Big data is big business these days. Organizations that hope to get ahead in crowded markets must utilize data from a variety of often highly disparate sources to understand how they’re performing and what customers are saying about them. However, data without the right analysis and reporting tools is just a waste of digital storage…

The post 7 Great Embedded Analytics Solutions – Which Embedded Analytics Solutions Should You Use? appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

How To Plan To Data Roadmap For 2024 – Elevating Your Data Strategy


It’s that time of year again, when data leaders, VPs, and directors need to start planning out their data roadmap. Of course, this brings up an important question: how should you start planning out your data roadmap? Especially if your data team has found itself stuck in the data service trap. Simply providing data and…

The post How To Plan To Data Roadmap For 2024 – Elevating Your Data Strategy appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Finding The Right ETL/ELT Solution – What Is Estuary And Should You Use It?


Data warehousing would be easy if all data were structured and formatted in the data source. Maybe we wouldn’t even need to build a data warehouse. But as anyone who has worked with data from more than one source knows, that’s rarely the case. Businesses today need to pull data from a plethora of sources,…

The post Finding The Right ETL/ELT Solution – What Is Estuary And Should You Use It? appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Common Pitfalls in Deploying Airflow for Data Teams


If you’re a data engineer, then you’ve likely at least heard of Airflow. Apache Airflow is one of the most popular open-source workflow orchestration solutions that gets used for data pipelines. This is what spurred me to write the article “Should You Use Airflow” because there are plenty of people who don’t enjoy Airflow or…

The post Common Pitfalls in Deploying Airflow for Data Teams appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

Apache Druid: Who’s Using It and Why?


The past few decades have increased the need for faster data. Some of the catalysts were the push for better data and decisions to be made around advertising. In fact, Adtech has driven many of the real-time data technologies that we have today. For example, Reddit uses a real-time database to provide…

The post Apache Druid: Who’s Using It and Why? appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com

How To Set Up Your Data Analytics Team For Success – Centralized vs Decentralized vs Federated Data Teams


Success in the data world hinges on team setup. I’ve delved into onboarding and standards in previous articles, but never into the structure of data teams. Typically, there are three configurations: Centralized, Decentralized, and Federated. Most companies I’ve seen use a mix of these. While the newest tech breakthroughs grab headlines, team organization is the…

The post How To Set Up Your Data Analytics Team For Success – Centralized vs Decentralized vs Federated Data Teams appeared first on Seattle Data Guy.


Author: research@theseattledataguy.com