5 Fivetran Use Cases – Examples Of How Companies Are Implementing Fivetran Into Their Modern Data Stack


Big data is the new commodity that companies need to drive the business forward. Having access to all your company's data sets used to be limited to large companies that could afford a whole team of data engineers and software developers. But this is no longer the case. Traditionally you would need to have a…
Read more

The post 5 Fivetran Use Cases – Examples Of How Companies Are Implementing Fivetran Into Their Modern Data Stack appeared first on Seattle Data Guy.


Read More
Author: research@theseattledataguy.com

Location data collection firm admits privacy breach

A British firm that sells people’s location data has admitted that some of its information was gained without seeking permission from users.

Huq uses location data from apps on people’s phones and sells it to clients, which include dozens of English and Scottish city councils. It told the BBC that in two cases, its app partners had not asked for consent from users.

But it added that the issue had now been rectified. In a statement, the firm said it was aware of two “technical breaches” of data privacy requirements, and that it had asked both app partners to “rectify their code and republish their apps”, which they had done.

Read More

Affinity Analytics using Actian Avalanche Cloud Data Warehouse

Affinity analytics is the practice of finding relationships and patterns in data. Businesses can use the results of affinity analytics for many positive impacts. Here are just two examples from real customer use cases. First, in retail, management wants to know which products typically sell well together, for product placement and advertising purposes. This information is critical to successfully upselling additional products. As another example, telecommunications providers need to study network traffic data to understand routing patterns and make the best use of their equipment and network topology. Like these use cases, your business likely has occurrences of data affinity that you can harness to make better business decisions. Actian Avalanche provides the data warehouse platform to help you do it.

Despite being clearly useful, affinity is difficult to find in traditional data warehouses because it involves executing one of the most difficult, resource-intensive SQL statements known: the fact-table self-join (also known as a “market-basket” query). This query is difficult because data warehouse “fact” tables often contain billions of rows (as mine does here), and joining billions of rows back to themselves to find affinity takes a lot of processing power. In fact, some platforms can’t do it at all, or it takes so long that it’s not usable. That is where the power of the Actian Avalanche Data Warehouse shines.

In this blog, I discuss how to successfully achieve affinity analytics using solely the built-in functionality of the Actian Avalanche Cloud Data Warehouse, with no other tooling required!

Actian Avalanche provides industry-leading cloud analytics, purpose-built for high performance. What I will show here is that Actian Avalanche natively provides the necessary tooling to accomplish SQL analytics, allowing you to achieve things like affinity analytics without having to embark on giant, expensive projects involving additional third-party tooling.

Here is my scenario:

I have a retail data warehouse. Marketing wants to plan an outreach mail campaign to promote sales of products that typically sell well with the store’s best-selling products.  In particular, they want to mail coupons to customers that have NOT bought products that are normally bought together, but HAVE purchased at least one of the best-selling products. They would like me to provide data to support this campaign.

My analytics process will be as follows:

  1. Investigate the data
  2. Find best-selling products (A)
  3. Find products commonly sold with top products (B)
  4. Find the customer population who bought A but not B
  5. Provide appropriate information to marketing

For this blog, I have created an 8 AU (Avalanche Unit) warehouse in the Google Cloud Platform. An Avalanche Unit is a measure of cloud computing power that can be scaled up or down. See Figure 1.

Figure 1: Avalanche console warehouse definition

My Actian Avalanche database has a typical retail schema, but for this blog, I will just focus on four tables.  See Figure 2.

Figure 2: Retail ER diagram

I have used a data generator to generate a large amount of data, but I’ve added some artificially superimposed patterns to make this blog more interesting. My tables have the following number of rows in them:

  • customer: 5,182,631
  • order: 1,421,706,929
  • lineitem: 45,622,951,425
  • product: 16,424


I can now use the tools provided in the Avalanche console Query Editor to execute my analytics process. You can find the Query Editor in the top right corner of the warehouse definition page. I have circled it in blue in Figure 1.

For all the queries in this blog, I performed the following sequence: I put my query into the query editor pane (1), optionally formatted the query (2), executed the query (3), and then saved the query (4) for future reference. See the sequence layout in Figure 3. Notice that you can also see the layout of my entire schema (red circle) in the Query Editor.

Figure 3: Query Editor layout

Investigate the data

First, I want to understand my data by executing a few interesting queries.

I want to understand what months of data are in my Avalanche warehouse and get some overall numbers. (Note: this blog was authored in early 2021.) I execute this query:

Figure 4: Line item statistics
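
The query itself appears in the original post only as a screenshot. As a rough reconstruction, a minimal sketch of such a statistics query might look like the following; the column names (order_date, extended_price) and join keys are assumptions, since the schema appears only in the ER diagram:

  -- Hypothetical sketch: date range, line-item count, and average sale.
  SELECT MIN(o.order_date)     AS first_order,
         MAX(o.order_date)     AS last_order,
         COUNT(*)              AS total_line_items,
         AVG(l.extended_price) AS avg_sale
  FROM   lineitem l
  JOIN   "order" o ON o.order_id = l.order_id;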

Because of the speed of Avalanche, in just a few seconds, I gleaned some valuable information from my warehouse. It looks like I have five years’ worth of data including over 45 billion line items sold, showing an average sale of $625.  That’s terrific! See Figure 4.

Also, I would like to see trended sales by month. I execute this query:

Figure 5: Trended sales
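
Again, the actual query is shown only as a screenshot; a hedged sketch of a monthly trend query, using the same assumed column names, might look like this:

  -- Hypothetical sketch: total sales by year and month.
  SELECT EXTRACT(YEAR FROM o.order_date)  AS sales_year,
         EXTRACT(MONTH FROM o.order_date) AS sales_month,
         SUM(l.extended_price)            AS total_sales
  FROM   lineitem l
  JOIN   "order" o ON o.order_id = l.order_id
  GROUP  BY EXTRACT(YEAR FROM o.order_date), EXTRACT(MONTH FROM o.order_date)
  ORDER  BY sales_year, sales_month;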

This query also finished in just a few seconds, but with all these big numbers, it’s a little hard to grasp their relative values. It will be helpful to make a chart using the Avalanche Query Editor’s charting function.

I’ve used the charting function (see Figure 6) to create a bar chart. I’m essentially running the same query, but I’ve simplified it and limited the output to just last year. It’s easy to see now that my sales really accelerated around Christmas. I’ve shown how I configured this chart in Figure 7.

Figure 6: Trended sales with chart

Figure 7: Chart configuration

Find best-selling products (A)

Now that I understand my data, I execute this query to find the best-selling product categories by spend in the last year:


Figure 8: Top categories by spend
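
Since the query appears only as a screenshot, here is a minimal sketch of what a top-categories-by-spend query might look like; the category and price column names are assumptions:

  -- Hypothetical sketch: product categories ranked by total spend last year.
  SELECT p.category,
         SUM(l.extended_price) AS total_spend
  FROM   lineitem l
  JOIN   product p ON p.product_id = l.product_id
  JOIN   "order" o ON o.order_id   = l.order_id
  WHERE  o.order_date >= DATE '2020-01-01'
    AND  o.order_date <  DATE '2021-01-01'
  GROUP  BY p.category
  ORDER  BY total_spend DESC;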

In just a few seconds, I learn that Clothing and Electronics were my best-selling product categories overall. I know that marketing always likes to work with Electronics, so I’m going to concentrate there.

Next, I want to find the top-selling products in Electronics last year. I execute this query:

Figure 9: Top products in Electronics
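
A hedged sketch of this top-products query, restricted to the Electronics category (again, all names are illustrative assumptions):

  -- Hypothetical sketch: best-selling Electronics products last year.
  SELECT p.product_name,
         SUM(l.extended_price) AS total_spend
  FROM   lineitem l
  JOIN   product p ON p.product_id = l.product_id
  JOIN   "order" o ON o.order_id   = l.order_id
  WHERE  p.category = 'Electronics'
    AND  o.order_date >= DATE '2020-01-01'
    AND  o.order_date <  DATE '2021-01-01'
  GROUP  BY p.product_name
  ORDER  BY total_spend DESC
  FETCH  FIRST 20 ROWS ONLY;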

Again, because of the speed of Actian Avalanche, in a few seconds, I learn that many of the top products in my Electronics category are Canon products.  See Figure 9.

Find products commonly sold with top products (B)

Now I want to find the Electronics products that are most often sold with these top-selling Canon products in the last six months. This is the resource-intensive market-basket query that I referred to in my introduction. To execute it, Avalanche will join my 45 billion line items back to the same 45 billion line items to see which items are typically bought together. I execute this query:


Figure 10: Market-basket query
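
To make the shape of a market-basket query concrete, here is a minimal sketch. The real query is shown only in the screenshot, so the LIKE pattern and column names here are illustrative assumptions; the self-join on lineitem is what makes the statement so resource-intensive:

  -- Hypothetical sketch: Electronics items most often in the same basket
  -- (order) as a top-selling Canon product over the last six months.
  SELECT p2.product_name,
         COUNT(*) AS times_bought_together
  FROM   lineitem l1
  JOIN   product  p1 ON p1.product_id = l1.product_id
  JOIN   lineitem l2 ON l2.order_id   = l1.order_id      -- fact-table self-join
                    AND l2.product_id <> l1.product_id   -- a different item in the basket
  JOIN   product  p2 ON p2.product_id = l2.product_id
  JOIN   "order"  o  ON o.order_id    = l1.order_id
  WHERE  p1.product_name LIKE 'Canon%'
    AND  p2.category = 'Electronics'
    AND  o.order_date >= DATE '2020-07-01'               -- roughly the last six months
  GROUP  BY p2.product_name
  ORDER  BY times_bought_together DESC;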

This query is much more complex than the previous ones; still, it took a mere 17 seconds to execute in Actian Avalanche. It is obvious from this query that Canon customers often buy SDHC Memory Cards of various types. That seems logical, of course, but I have now proven it with analytics.

Find the customer population who bought A but not B

Now I need to find the names and addresses of customers who bought the top Canon products but have NOT bought memory cards. This is basically a reverse market-basket query. Avalanche will join the 45-billion-row line item table back to itself, this time to find missing relationships: customers who have not bought memory cards. It then needs to join the line item and order information back to the customer table to get the corresponding name and address information. Also, I need to make sure I don’t send duplicate mailings to any customer who may have bought multiple Canon products, so I have added the DISTINCT keyword to my SQL. I execute the query below. Once it is finished, I choose the .csv download option to create an output file. See the red circles in Figure 11.

Figure 11: Reverse market-basket.  No affinity.
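
A hedged sketch of this reverse market-basket query: the outer joins find the Canon buyers, the NOT EXISTS subquery removes anyone who has ever bought a memory card, and DISTINCT prevents duplicate mailings. All table and column names are assumptions:

  -- Hypothetical sketch: Canon buyers who have never bought a memory card.
  SELECT DISTINCT c.customer_name, c.address
  FROM   customer c
  JOIN   "order"  o ON o.customer_id = c.customer_id
  JOIN   lineitem l ON l.order_id    = o.order_id
  JOIN   product  p ON p.product_id  = l.product_id
  WHERE  p.product_name LIKE 'Canon%'
    AND  NOT EXISTS (
           SELECT 1
           FROM   "order"  o2
           JOIN   lineitem l2 ON l2.order_id   = o2.order_id
           JOIN   product  p2 ON p2.product_id = l2.product_id
           WHERE  o2.customer_id = c.customer_id
             AND  p2.product_name LIKE '%SDHC Memory Card%');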

Provide appropriate information to marketing

I can now easily mail the .csv file of prospective customers to marketing so they can send out their mail campaign.

Figure 12: Email with target list

In conclusion, the Actian Avalanche Data Warehouse is a very powerful cloud data warehouse platform that also includes the basic tools and speed you need to be productive with affinity analytics in your business.

You can download a full-function free trial of Avalanche and see what affinities you can find in your own data!

The post Affinity Analytics using Actian Avalanche Cloud Data Warehouse appeared first on Actian.


Read More
Author:

Mary Schulte
Data Warehouse Best Practices

In every industry, the need to follow best practices exists. Data warehouse best practices are no exception. Best practices are methods or techniques accepted as a good or best way to accomplish an activity, process, or practice. All practices evolve, but the best way to start is with a foundation of best practices and then adapt those practices to meet the specific needs of an organization. Organizations that continually evolve their best practices based on industry, customer, and internal feedback will create unique best practices, resulting in a strategic, tactical, or operational advantage over similar organizations serving the same markets.

Best practices enable assets, capabilities, and resources to deliver value to the organization, stakeholders, and customers. A data warehouse can be a strategic resource for any organization. Developing a data warehouse practice into a unique capability requires making the data warehouse best meet the organizational objectives that the data warehouse technology supports. 

Data Warehouse Best Practices

Data within an organization is sometimes not leveraged as much as it can be. Many organizations find themselves making decisions using best effort or expert opinions in most cases. These decisions can become more powerful and meaningful when backed with intelligent data, information, and knowledge relative to the needs of data consumers. To do this, organizations have to work as a team and remove as many silos as possible related to the services and products they deliver and support. Data exchanges between customers and all the functional units in the organization help make this happen.


Organizations rely on many best practices in various functions to perform as efficiently as possible. There are best practices for managing people, methods, processes, and technologies. Listed below are several best practices and data warehouse considerations that should be adopted within an organization to help enable value from a data warehouse:

  • Identify what decisions need to be made within each functional unit of the organization and how data supports their conclusions. Data should have a purpose. Data collected that does not have a goal is a waste of the organization’s precious resources. The organization must be efficient and effective with data collection, including exchanging data between functional units, transforming data into information, and then shifting it into knowledge for decision support.
  • Create models. Service models, product models, financial models, and process models help organizations understand data exchanges and the data needed by different stakeholders to define and architect the data warehouse data model. The data model helps the organization understand the value chains between different data consumers and how data should be presented.
  • Understand decisions that need to be made by each consumer of the data in the data warehouse. Analyze and understand data needs for each consumer of the data.
  • Decide governance, risk, and compliance (GRC) policies, processes, and procedures. Managing data is very important to any organization and should be treated with utmost care and responsibility. Data activities within the organization have to operate efficiently, effectively, and economically to avoid risk and resource waste.
  • Decide the type of initial data warehouse. Decide whether a 1-, 2-, or 3-tier architecture is the best initial approach for your data warehouse. Remember, data warehouses support analytical processing and are not suitable for transactional processing.
  • Decide if the data warehouse should be on-premises, cloud, or hybrid. This includes understanding the budget available for the overall initial program/project and its impact on the decision.
  • Decide initial sources for input into a data warehouse. Remember, data sources can grow over time as the needs of the organization grow. It is essential to make sure adding new data sources is easy to accomplish.
  • Create a project plan to break up the delivery approach into manageable units for showing value quickly using the data warehouse. Don’t try to be perfect with a never-ending project or try a big-bang approach. Show value as soon as possible. Be agile, get good enough and plan for continuous improvement.
  • Decide data warehouse needs for availability, capacity, security, and continuity. The data warehouse has to be available when needed, have enough capacity to support demand, and be secure, maintaining confidentiality and integrity while remaining available to those who need it. For continuity, the data warehouse should be included in business impact analysis and risk assessment planning. Usability and performance are also considerations for the data warehouse’s warranties to its consumers.
  • Decide how often data needs to be loaded and reconciled from data warehouse sources, based on the timeliness and relevance of data changes to decisions. Use Extract, Transform and Load (ETL) to help migrate data between sources and destinations. Data warehouse staging is a best practice to help stage data on a regular schedule for decision needs (a minimal staging-load sketch follows this list).
  • Set up data for reporting, analytics, and business intelligence. Data warehouse reporting best practices have to be enabled for ease of use by data consumers. The consumer should be able to quickly and easily create dynamic reports from the data warehouse.
  • Follow agile best practices for change, release, and deployment management to help reduce risks and increase knowledge transfer. These best practices should integrate and align with other best practices in the organization.
  • Make sure to hire experienced people who are experts in data warehouse planning, design, and implementation. Bringing the right team together is one of the most important best practices for data warehouse design, development, and implementation. No matter how good the technology is, the overall results will be disappointing without the right people. Project managers, business analysts, data analysts, data engineers, data architects, security analysts, and knowledge managers are key roles that can help with a successful data warehouse.
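
As referenced in the loading bullet above, a minimal sketch of a scheduled staging load might look like the following; the schema, table, and column names are purely illustrative:

  -- Hypothetical sketch: move one reconciled batch from staging into a fact table.
  INSERT INTO warehouse.sales_fact (order_id, product_id, sale_date, amount)
  SELECT s.order_id, s.product_id, s.sale_date, s.amount
  FROM   staging.sales_raw s
  WHERE  s.load_batch = 42            -- only the newly staged batch
    AND  s.amount IS NOT NULL;        -- a basic data-quality gate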

Best practices in business intelligence and data warehousing go hand in hand. The better the data warehouse’s technical infrastructure, the better the organization can collect, store, analyze, and present the data for consumer intelligence. Organizations have to be careful of insufficient data quality resulting in bad data for business intelligence. The data warehouse should easily support tools or applications that need the data for business intelligence. Reporting, data mining, process analysis, performance benchmarking, and analytical tools support business intelligence and should be quickly implemented without creating homegrown solutions for the data warehouse.

In Summary

This blog has discussed many data warehouse best practices. Depending on the organization and challenges they have experienced, more best practices can be added to the ones listed above. Best practices can come from anywhere in the organization based on experiences, challenges, and the overall market dynamics that the organization faces. Data warehouses and enterprise data hubs are fast becoming a strategic component for many organizations. Since a data warehouse is a large project that will mature over time, it should become a major program in the organization. Data is the blood that runs through the organization; this will not change. Data management will advance with emerging technologies, but the purpose will remain to help the organization make better informed and more timely decisions. Make a plan to start or improve your data warehouse outcomes by using best practices and selecting the right partners, technologies, and software to aid your journey.

Actian Avalanche™ combines one of the industry’s fastest hybrid-cloud data warehouses with self-service data integration in the cloud to create better customer insights. With an easy, drag-and-drop interface, Avalanche empowers anyone in the organization – from data scientists to citizen integrators – to easily combine, clean, and analyze customer data from any source, in any location.

The post Data Warehouse Best Practices appeared first on Actian.


Read More
Author:

Teresa Wingfield
Digital transformation depends on effective customer data management

There has been an incredible amount of technological change that impacts our everyday lives. The advent of the internet, social media, mobile, and more has changed how customers interact with brands, hence the need for change from brands.

It is the need for that change that drives effective digital transformation.

Digital transformation can mean different things for different organizations. It can mean launching e-commerce or a mobile app. For some, it could be about improving the web experience.

Read more

What is an Enterprise Data Hub?

When managing big data, organizations will find that there will be many consumers of the vast amounts of data, ranging from applications and data repositories to humans via various analytics and reporting tools. After all, the data is an expression of the enterprise, and with digital transformation, that enterprise is increasingly expressed in the form of applications, data, and services delivered. Structured and unstructured data in various formats becomes the sources and destinations of exchanges between functional units in the organization – exchanges that are no longer done just manually or with middleware but can now be hosted collaboratively utilizing data lakes, data warehouses, and enterprise data hub technologies.

The choice of which data management solution to use depends on the organization’s needs, capabilities, and the set of use cases. In many organizations, particularly large or complex ones, there is a need for all three technologies. Organizations would benefit from understanding each solution and how the solution can add value to the business, including how each solution can mature into a more comprehensive higher-performing solution for the entire organization. 

What is an Enterprise Data Hub?

An enterprise data hub helps organizations manage data directly involved – “in-line” – in the various business processes, unlike data warehouses or data lakes, which are more likely to be used to analyze data before or after its use by various applications. Organizations can better govern data consumption by applications across the enterprise by passing it through an enterprise data hub. Data lakes, data warehouses, legacy databases, and data from other sources such as enterprise reporting systems can all contribute to the governed data that the business needs.

Besides data governance protection, an enterprise data hub also has the following features:

  • Ability to make use of search engines for enterprise data. Search engines act as filters, allowing quick access to the enormous amounts of data available in an enterprise data hub.
  • Data Indexing to enable faster searches of data.
  • Data Harmonization enhances the quality and relevance of data for each consumer of data, including improving the transformation of data to information and information to knowledge for decision making.
  • Data integrity: removing duplication, errors, and other data quality issues to improve and optimize the data’s use by applications.
  • Stream processing binds applications with data analytics, including simplifying data relationships within the enterprise data hub.
  • Data Exploration increases the understanding and ease of navigating the vast amount of data in the data hub.
  • Improved Batch, Artificial Intelligence, Machine Learning processing of data because of the features listed above.
  • Data Storage consolidation from many different data sources.
  • Direct consumer usage or application usage for further processing or immediate business decisions.

Enterprise data hubs can support the rapid growth of data usage in an organization. The flexibility in using multiple and disparate data sources is a massive benefit of selecting a data hub. Leveraging the features mentioned above increases this benefit.

Difference Between Enterprise Data Hub, Data Lake, and Data Warehouse

Data lakes are centralized repositories of unorganized structured and unstructured data, with no governance and specifications for organizational needs. The primary purpose of a data lake is to store data for later use, though many data lakes have developer tools that support mining the data for various forward-looking research projects.

A Data Warehouse organizes the stored data in a prescribed fashion for everyday operational uses, unlike a data lake. Data Warehouses can be multitiered to stage data, transform data and reconcile data for usage in data marts for various applications and consumers of the data. A data warehouse is not as optimized for transactional day-to-day business needs as an enterprise data hub.

In addition to drawing data from and pushing data to various enterprise applications, an enterprise data hub can use a data lake, data warehouse, and other data sources as inputs into, or destinations from, the data hub. Once all the data is available to the hub, the aforementioned features, such as governance, can be applied to it. Enterprise data hub vs data lake can be easily differentiated based on the data hub’s additional capabilities for processing and enriching the enterprise data. Enterprise data hub vs data warehouse can be confusing, but the data hub has additional capabilities for using the data in business-process-oriented rather than business-analytics-oriented operations.

Enterprise Data Hub Architecture

The following diagram shows an Enterprise data hub architecture that includes multiple data sources, the hub itself, and the data consumers.

Enterprise Data Hub

The Enterprise data hub Architecture is designed for the most current needs of organizations. The architecture itself can grow to accommodate other data management needs, such as the usage of data in emerging technologies for decision support and business intelligence.

In Summary

With the increasing adoption of disparate data and Big Data practices, enterprise data hubs are becoming the architecture for creating a unified, integrated data system that enables better business processes across the enterprise. An enterprise data hub can utilize data from any source and of any type to create a single source of truth about the organization’s customers, services, and products. This single source of truth can be used collaboratively across the organization to share data for timely, higher-performing business operations, automation, and decision-making.

Organizations with data hubs and supporting data sources can become more competitive than those that do not. Data is the lifeblood of the organization that enables optimized and automated business processes and decision support for organizations to make better decisions. This capability is well worth the time and investment for the organization.

Actian can help you with your cloud data integration challenges. Actian DataConnect is a hybrid integration solution that enables you to quickly and easily design, deploy, and manage integrations on-premises, in the cloud, or in hybrid environments.

The post What is an Enterprise Data Hub? appeared first on Actian.


Read More
Author:

Lewis Carr
Why Migrate To The Modern Data Stack And Where To Start


Photo by Leohoho on Unsplash Migrating to the modern data stack means poising your organization to meet the varying demands of modern data. The modern data stack seeks to overcome the problems and challenges associated with modern data while enabling an organization to innovate and automate like never before. As more organizations migrate to the modern…
Read more

The post Why Migrate To The Modern Data Stack And Where To Start appeared first on Seattle Data Guy.


Read More
Author: research@theseattledataguy.com

Josh Steigerwald Internship Experience

About the author: Josh is currently a senior at Stony Brook University, where he pursues studies in computer science and finance. Having spent most of his life fascinated by technology and pursuing entrepreneurial endeavors from a young age, he hopes to leverage his computer science skills and business savvy to make his mark in the tech world.

Over the past summer, I had the pleasure of working as a Software Engineer Intern at Actian. It was an incredible experience—one of learning, growth, and excitement. Continue below to follow along on my journey!

Before my internship even began, my learning started. Knowing that I’d be working with Scala and Apache Spark, I set out to show up with these tools under my belt. To accomplish this, I took a course on Udemy, read books, and practiced…a lot. By the time I had completed my first week at Actian, I felt confident in my abilities to work with these tools. I was ready to get to work!

Okay…maybe I wasn’t really ready to get to work. Orientation aside, the first week or two were a bit of a challenge. Let’s just say: you don’t know what you don’t know until you know that you don’t know it. Does that make sense? Not really? My point is, there was a lot of learning to do—from getting my development environment set up, to working with new tools and languages, to trying to understand existing codebases, everything was new to me. This was an overwhelming feeling, and at the time, I wasn’t sure how I was ever going to figure it all out. Looking back now, it’s remarkable to me exactly how much I’ve learned in just 12 weeks. That’s something that speaks to the quality of the Actian internship experience.

My job as an intern was to develop a prototype implementation of the Apache Spark Data Source V2 API to replace the existing Data Source V1 API implementation in the Spark-Vector Connector project. Once I had gotten past the learning stage in the first few weeks, I began developing the prototype implementation. This was a process that involved a lot of research and patience. After week 7, I finally saw my work come to life—I had a working implementation!

This was an exciting milestone that I was extremely excited to share with those who supported me throughout the internship. On that note…

Perhaps the greatest part of my internship experience was all of the support I had. Not only did I have the opportunity to meet with my internship “buddy” daily, but I also had an incredible amount of support from the HR team here at Actian. From weekly 1:1 meetings to discuss my internship progress, to fun intern events where I had the opportunity to meet with other interns and unwind for a bit, the HR team provided me with every ounce of support and encouragement I needed to be successful.

After my initial accomplishment of seeing my implementation working, I continued to develop more features to complete one functionality (reading data) of the Data Source V2 API. By week 9, I was ready to benchmark my implementation. This was a point in the project that really opened my eyes up to the challenges that are faced in real-world software development situations. Simply put, no matter what I tried, I could not run my benchmark queries on large enough amounts of data to come to any concrete conclusions. It seems the culprit of this was a resource-starved laptop—something that could have been worked around rather easily under normal circumstances. However, given the nature of the internship and the time constraints that accompany it, a workaround was not in the cards. As a result, benchmarking was abandoned and I moved on to continue implementing features of the Data Source V2 API with the little time I had left at Actian.

One of the most notable aspects of my internship was working not only remotely, but internationally—I’m on the east coast of the U.S., while my team was based in Germany. When I was starting my day, my team was just about to end theirs. This was a bit of a challenge, but also an opportunity for immense growth. Because of this challenge, my skills in researching and developing solutions to problems on my own were greatly sharpened. On top of that, I found it very valuable to be working for a company with such a diverse, global reach.

Towards the last few weeks of my internship, much attention was given to my capstone project presentation. This was a wonderful opportunity to present my work to the company and showcase my accomplishments. On top of that, it helped me to gain confidence in my presentation skills. Having this opportunity made it possible for me to complete my internship with a true sense of pride and accomplishment.

Overall, my experience as an intern at Actian has been one that I will always remember. I’ve gained knowledge and skills, and forged valuable connections. For these reasons and many more, Actian will always hold a special place in my heart!

If you or someone you know is interested in becoming an intern for Actian, please contact talentacquisition@actian.com. We will begin recruiting for the 2022 summer program soon, so we encourage you to send your resume today!


The post Josh Steigerwald Internship Experience appeared first on Actian.


Read More
Author:

Josh Steigerwald
Scoping a Master Data Management requirement

Assuming you have the executive endorsement, a budget, or a commitment from the business to improve customer data quality, the first thing you will want to do is determine the scope of work that you need to undertake as part of your customer-related data governance project.

In order to achieve effective horizontal integration with all the business units within the organization, the scoping of your MDM needs to gather the requirements of all employees and interest groups that would be expected to work with, or be dependent on, a new and unified approach to customer master data management.

The use of a standardized technical framework will help in framing whether the solution meets the needs of each business area.

Read more

How News Corp and Near are innovating with first-party data

Data is the new ‘glue’ that will hold together the evolution of marketing, and this future is itself being redefined at breakneck speed. The fast-evolving trends on data and privacy are keeping marketers on their toes constantly as they get ready for a cookie-less world.

This fireside chat at The Drum Digital Summit untangles this multi-faceted theme and helps marketers get future-ready. The panel features Suzie Cardwell, general manager of client product and strategy, News Corp Australia, and Shobhit Shukla, co-founder and president of Near. The session looks into the ways in which businesses can future-proof themselves using data, with insights from News Corp and Near.

Read More

Is Your Data Safe?

In her new book, Information Security Essentials, Susan McGregor outlines the crucial steps for protecting news writers, sources, organizations—and anyone—in the digital era.

Information Security Essentials: A Guide for Reporters, Editors, and Newsroom Leaders by Susan McGregor, an associate research scholar at Columbia’s Data Science Institute, is an indispensable guide for protecting news writers, sources, and organizations in the digital era. McGregor provides a systematic understanding of the key technical, legal, and conceptual issues that anyone teaching, studying, or practicing journalism should know.

McGregor answered some questions about the book for Columbia News, and also shared some reading recommendations and her party guest wish list.

Read more

Data Warehouse vs Database – Which Should you Choose?

Data warehouse vs database: should you choose one or the other – or, in some cases, would you need both? Each has a purpose and a value for your organization. Each can be simple or complex; both support organizational decisions. In general, they are complementary. The question becomes: which one do I need, based on the outcome the organization is trying to achieve for a given process or project? Knowing the difference can help avoid mistakes that could jeopardize the success of using either technology to support the needs of the business. Be careful not to mix up the value of the two approaches and choose the wrong one for the business processing you are expecting.

What is a Database?

A database stores data and information in a logical relationship with other data and information. Usually, a database has a particular focus relative to a specific part of a business: it contains data related to a specific operation or business function, collected during the course of running that operation or function. The point of the database is to store all pertinent information related to that particular operation. For example, a set of customer, employee, or citizen records, or the parts lists for all components of all products manufactured by a company – either would be a great target use case for a database.

Organizations can have multiple databases supported by different database systems or the same one. Databases can be separated in any way that the business finds valuable; separation can be driven by performance, security, or any other valid business or technical reason. One database can be used by multiple people in different roles in the organization, each finding value, including the ability to collaborate with other departments.

Databases can be specialized to handle certain types of data and/or certain data operations. For example, a database that is being used by multiple users concurrently needs to make sure that any given dataset, or element of that set, is only being written to by a single user at any given time to avoid data corruption. Databases that conform to this requirement are considered ACID (Atomic, Consistent, Isolated, Durable) compliant and are used in most Online Transaction Processing (OLTP) operations.
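
To illustrate the ACID point, here is a minimal sketch of an OLTP transaction; the parts table and its columns are hypothetical, and the exact transaction syntax varies by database:

  -- Hypothetical sketch: move stock between two part records atomically.
  -- Either both updates commit together or neither takes effect.
  START TRANSACTION;
  UPDATE parts SET quantity = quantity - 10 WHERE part_id = 1001;
  UPDATE parts SET quantity = quantity + 10 WHERE part_id = 2002;
  COMMIT;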

Databases for OLTP come in many types. The majority use either row-based or columnar architectures, and both generally use Structured Query Language (SQL), though many also provide other programmatic APIs, often lumped together under the NoSQL label. Implementations of these architectures can vary in complexity and use: there are small personal databases and enterprise-grade databases. Some databases come with defined structures and tables, sometimes called a common data model; others have nothing beyond the common system structures and tables. All databases allow the creation or addition of any tables that an organization needs. Tables may consist of very structured data with a well-defined schema, or of semi- or unstructured data, for example document stores or video archives. Other databases have hybrid structures, or core underlying structures such as a key-value store, that make them very flexible about the types and variety of data that can be stored in a given database.

What is a Data Warehouse?

A data warehouse’s underlying core engine is a database. The key difference is the degree of sophistication of its management and its focus on bringing together data from many disparate and diverse sources, aggregating it across domains, departments, or other series of operations, largely for the purposes of online analytical processing (OLAP) of information outside of the actual operational process execution. A data warehouse periodically collects information from operational databases, historically in batch mode but increasingly in real-time streams as well. Over time, this aggregated data represents a historical dataset that serves as a baseline pattern for more advanced analytics. However, typical usage is reporting on operational efficiencies or other key performance indicators that drive business decisions at all levels of an organization.

There are different types of data warehouses offered by multiple vendors. Each has some of the same capabilities that define them as data warehouses. Some vendors will differentiate themselves by adding feature enhancements and additional applications that the other vendor does not have.

Data Warehouse vs Database: Key Differences

The key differences between a data warehouse and a database:

  • A database is used for active daily transactions such as inserting, deleting, or updating records based on daily interactions within an application. This is sometimes called OLTP.
  • A data warehouse is used to analyze lots of data simultaneously, usually to produce a report or do trend analysis. This is sometimes called OLAP. (The contrast is sketched just after this list.)
  • In any enterprise, downtime for mission-critical operations can be catastrophic, but this is far more the case with an OLTP system and the database it’s built upon – particularly one focused on financial transactions – than with OLAP systems and the data warehouses they’re built on, as analytical operations tend to sit outside day-to-day operations.
  • Databases are optimized through normalization and indexing to allow quick online transactions against the data in the database, with an emphasis on writes and updates. Query time is reduced in a normalized database because “pre-wiring” all the internal relationships between the normalized data structures accelerates query returns.
  • Data warehouses are designed to handle complex analytics without the normalization of data structures that a database needs to perform well. Unlike in a database, multiple views of data and data redundancy are allowed. The emphasis is on bulk yet selective reads of datasets.
  • A database can support thousands of concurrent users at one time, each with their own access requirements to the data in the database. This support helps with capacity utilization of the database, performing needed data access at desired response times or service-level agreements. Doing analytical processing on the same system, however, can affect all users’ response times.
  • A data warehouse can also support a large volume of users at one time but generally requires more resources to support concurrency, given the size of datasets in a data warehouse and the complexity of the queries that it runs.

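As a concrete contrast (referenced in the first two bullets above), here is a minimal sketch with hypothetical table names: an OLTP database spends its time on small, indexed, single-row operations, while a data warehouse scans and aggregates large swaths of history.

  -- Hypothetical OLTP operation: touch one indexed row and return quickly.
  UPDATE account SET balance = balance - 50.00 WHERE account_id = 42;

  -- Hypothetical OLAP query: scan an entire fact table and aggregate it.
  SELECT region,
         EXTRACT(YEAR FROM sale_date) AS sales_year,
         SUM(amount)                  AS total_sales
  FROM   sales_fact
  GROUP  BY region, EXTRACT(YEAR FROM sale_date);
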
With these key differences listed, it is essential to understand that a database is not in conflict with a data warehouse related to capabilities and structure. Each can add value, but the value is determined based on usage.

Why Databases in Business?

Databases are used to help organizations structure data in a meaningful way so that an understanding of data relationships can be used for decisions and enable the organization to deliver good services and products. A database also helps an organization’s different departments work in a coordinated fashion, using automated technology and tools to do their jobs without otherwise-manual interventions. Databases bring related data together in one structure for data integrity.

Why Data Warehouses in Business?

Data warehouses are essential for analyzing data in ways that should not be done on a transactional database. This analysis is necessary for discovering trends and answering questions about the past, present, and future that the organization needs to know to make decisions. A data warehouse can take data from various sources and analyze them together. Without data warehouses, each department in an organization may have its own data, and additional processing will be needed to use the different data sources together.

Data Warehouse vs Database: Which is a Good Fit for your Business?

Database or data warehouse: which is a good fit for your business? Probably both; each has capabilities that support the business’s performance and its ability to understand its customers. Besides that, both enable organizational collaboration and coordination across the company in an automated fashion. Each has specific capabilities, which spares the organization from ineffectively stretching one solution to cover both uses. Many organizations have both and use them together.

The post Data Warehouse vs Database – Which Should you Choose? appeared first on Actian.


Read More
Author:

Lewis Carr
David Raab of the CDP Institute says CDP and CRM should not be confused.

You do your readers a huge disservice by conflating CDP and CRM.  Yes, both store customer data – as do data lakes, data warehouses, marketing automation, email engines, personalization tools, web content managers, and a host of other systems.  Each of those is designed for a specific purpose and stores customer data in a way that fits that purpose. 

CRM also has its own purpose – to support sales and service agents when speaking with customers – and is optimized for it.  CRMs are notoriously bad at dealing with data that was imported from elsewhere, and with unstructured and semi-structured data types.   

They’re generally poor at sharing their data with other systems.   

read more

What is the difference between business dictionary and business glossary?


Is a business dictionary not the same as a business glossary? They are similar, but not the same. Both of these artifacts have the same purpose; what differs is some of the rules that govern them. Let’s uncover what these are. […]

The post What is the difference between business dictionary and business glossary? appeared first on LightsOnData.


Read More
Author:

George Firican
How To Improve Your Data Analytics Strategy For 2022 – From Improving Your Snowflakes Instance To Building The Right Pipelines


Photo by Startaê Team on Unsplash 2022 is around the corner and it is time to start looking towards improving your data strategy. Our team has seen several trends in 2021 in terms of methods which can help improve your data analytics strategy. Whether it be optimizing your Snowflake table structures to save $20,000 a…
Read more

The post How To Improve Your Data Analytics Strategy For 2022 – From Improving Your Snowflakes Instance To Building The Right Pipelines appeared first on Seattle Data Guy.


Read More
Author: research@theseattledataguy.com

Tactical vs. Strategic Master Data Management

An observation in the market is that every organization recognizes that it has some sort of data issue that could be improved with the implementation of yet another solution.

There are plenty of vendors whose primary objective is to push their technology or solution without too much concern about whether the solution meets your particular business needs. Suitability is ultimately determined at the time of subscription or software maintenance renewal.

read more

What Is Airbyte and Why You Should Use It?


Data is an integral aspect of any business. It allows for solution development and metric tracking, and creates a structure for streamlined and integrated processes. Data empowers business decisions. I say that based on the fact that consulting firms like McKinsey have found in their research that companies using AI and analytics can attribute 20%…
Read more

The post What Is Airbyte and Why You Should Use It? appeared first on Seattle Data Guy.


Read More
Author: research@theseattledataguy.com

Be strategic … not tactical


A generation ago, when my eldest children were still young enough to be gathered up in my arms, I entered a work position where my hiring manager quit his job literally a few weeks before I started.

To say that this situation was a little disconcerting would be to understate the level of concern that I had. At the same time, though, I saw this as an opportunity to potentially step into the breach, so to speak, and fill the void left by the former manager.

Such situations present themselves as both opportunities and threats but both are heavily dependent on the many circumstances that surround a job role and the work that needs to be done.

My skip-level manager, a C-suite exec, became my direct manager, though to call him a people manager is a courtesy that these days I might be more judicious about according him. In reality he was a frustrated politician, a people person to be sure, but deep down in his heart he was a process guy. He loved process!

What I didn’t appreciate at the time is that he, too, had an exit strategy in mind as he grew closer to retirement and became more threatened by young upstarts with comparable if not more impressive credentials despite being twenty-plus years his junior.

His first assignment to me was to establish a level of credibility within the business that would assure me implicit support from his peers at the C-suite, or at the very least General Manager, level.

This would be no mean feat given that the division in which I worked was largely regarded as a cost center and seen as relatively expensive when examined through a cross-charge accounting lens as a shared service.

Manufacturing was a large part of what our business did and as such carried the lion’s share of the shared service cost.

A second assignment was to determine a strategy for the business as a whole but more particularly for the manufacturing side of the business where it was felt things were a little maverick and unbridled.

No time horizons were defined but it stood to reason that as soon as I had established myself I should start the assessment of the status quo and start formulating plans.

To the experienced reader it should by now be obvious that there are two elements here: the tactical and the strategic. The brief was to be strategic, but in reality there is always tactical stuff to be done too.

My direct manager who had left had stuck around for about two years and then parted ways to go into academia. His departure left no evidence of anything tactical or strategic, so I was left to my own devices.

Setting a strategy

Setting a strategy is something I had done before but it had always been something undertaken with a clarity of the overall mission and vision. There was always a defined expectation around budget and some degree of commitment and certainty as to the timeline.

Things here were not so much green-pastured and untilled as cloudy, ambiguous, and unclear.

It has been said that engineering-led organizations often don’t produce great products. Similarly, product-led organizations sometimes produce a great experience but with poorly-thought-out engineering design. Both are potentially true; there has to be a balance.

We had the same issue here: we had a manufacturing business with a strong reputation and excellent technical execution. But it wasn’t a very daring business; it didn’t push the boat out very far, and it took very few technical or ambition-based risks. Any roll of the dice tended to be really big and over-calculated. You could say bold, but really it was heavily de-risked. The chances of failure were minimized as much as possible.

Timing

Innovation was present, but it was innovation brokered on the backs of other pioneers and past success, scaled and refined within this context. What this meant is that strategic vision fell into very, very long-term planning, and everything else was largely tactical. There were degrees of tactics, though. A decision to replace broad swathes of expensive manufacturing equipment, for example, was something that could seemingly be taken almost on a whim, though that often concealed the fact that the decision might have been taken as many as two or three years earlier.

Similarly, a piece of equipment that was on its last legs and quite critical to the overall manufacturing operations was seemingly orphaned in terms of ownership and responsibility because it was supplier-installed – installed without much context or documentation. While it was critical in nature, there wasn’t a lot of particular concern about it, inherently, because the vendor was held accountable for care and maintenance based on an expensive contract. Decisions needed to be made for the short term and for the long term depending on what the end-game was perceived to be for it.

All in all, what this conveys, I suppose, is that it is easy to assume that all quick short-term decisions are tactical, but that’s not entirely true. Some almost spur-of-the-moment decisions can be taken today but have a long-lasting, significant impact that in effect is strategic in nature. So irrespective of time horizons, you need to appropriately consider and classify.

Power plays

I already mentioned that the void left by the manager who departed meant that there was work to be done in ensuring that the largest value-contributing group – manufacturing – got some reassurance that their tactical and strategic needs were being addressed.

This becomes particularly challenging when there is a credibility issue at stake. Unfortunately, the relative independence of manufacturing operations up until the establishment of the shared services function meant that manufacturing felt they had lost something with the excision of a number of key resources from an embedded model to a central shared services model.

The task, then, was to work on conveying reassurance that the same levels of service and support could be achieved for the business unit, without compromise, despite the changes in the service model.

When the politics and emotions of individuals are a dominant aspect of the business, it is important to recognize that you may land up straddling an uncomfortable position where you are accountable to many business leaders. For the most part this shouldn’t be too difficult to manage as long as the overall mission and intent of the collective are aligned and copacetic. Even if they aren’t, you need to work out what matters in common between the disparate groups and develop thoughts and plans that harmonize the expectations and needs of divergent stakeholders.

I’d love to suggest that I did this well. I would say that I had moderate success.

Put another way, neither of the most senior decision makers – the C-suite exec or the GM for manufacturing – turned me out on my ear or refused to entertain my suggestions or plans. But to suggest that it was easy to reach consensus would be to dismiss the divergent agendas of two strong personalities who were well tenured in the organization. Both wanted to leave a lasting legacy of decisions and choices that would stand the test of time long after they had left the organization, but at the same time, both had different thoughts about what really mattered.

Small tactical decisions would therefore become tension points, while larger, more elaborate plans that could be considered more strategic in nature found natural common ground and were easier to build consensus around.

My determination, then, was that one needs to be mindful of the fact that sometimes decisions that are obviously going to be of lower impact and lower cost, and therefore more likely tactical, may nonetheless become grounds on which unnecessary tug-o’-wars take place. One just needs to recognize this and work through it as best one can. Influence where you can, and be aware of the power dynamics between those that matter.

What matters

My last thoughts in this piece are around sensitivity to what ultimately matters. You could think of this in the context of prioritization, but it is a lot more than that, particularly if your function and its tactical and strategic plans are not considered particularly focal to the business.

My observation, across a number of industries, is that several functions are not considered terribly important until they fail or become a problem. Amongst these are accounting, IT, quality assurance, and personnel management. These supporting functions may be important, but if they are not what defines the business or organization, then they are only ever going to be supporting.

This means that investment, which is often strategic in nature, will not happen unless you can articulate or demonstrate how these will fundamentally impact the effectiveness of the organization.

I think we see this time and time again. In a municipality or local government environment, for example, one recognizes that the budgeting process, collections, payables, capital management, and all things financial are important, but when politics is the foremost concern of the day, you have to work out how failure in any one of these lesser areas will negatively impact the political aspects.

Similarly if the business is a logistics business, with big capital spend on vehicles, rolling stock and equipment, the foremost topic is unlikely to be inventory management, communications or control systems.

Depending on your industry, your role, your influence, and the timing, you have to work out how these lesser components impact the effective management of what matters for the business. When you have worked all of this out, you may have the perfect environment for determining how much you can do strategically vs tactically.

A nod to Jo Miller and her 2018 post on Forbes for some of the inspiration for this piece.


Read More
Author: Clinton Jones

An Easier and Safer Modernization Journey for Ingres and Database Applications

Actian’s Ingres NeXt Initiative paves the way for transformation

Are you considering modernizing either your Ingres database or your “green screen” ABF or thick-client OpenROAD applications? If so, you may have read about modernization options like those from Gartner listed below. Gartner has ranked these from easiest (Encapsulate) to most difficult (Replace). The easier the approach, the lower the risk, but also the lower the business benefit both technically and operationally. The more difficult the approach, the greater the risk but the greater the reward.

Source: Gartner, 7 Options to Modernize Legacy Systems

Organizations need to modernize infrastructure to respond more quickly to market changes, keep up with the competition, and support digital transformation. If you’ve been wishing that you could move Ingres to the cloud and/or extend your database applications with a modern user interface and/or web and mobile access, that’s normal. At the same time, it’s also understandable if the effort and risks involved in modernization seem terrifying. According to a study by Hitachi Vantara, more than 50% of modernization projects fail.

But what if I told you that Actian’s Ingres NeXt Initiative can provide a safe and easy way to execute a high-value modernization project involving Ingres and your database applications? Let’s have a closer look at how we do this.

Modernizing Your Ingres Database

The Ingres NeXt Initiative provides four flexible options to meet your needs.

  • On Premises (Linux, Windows, UNIX, VMS, virtualized or bare metal)
  • Bring Your Own License (Your Infrastructure, Your Software, Self Managed)
  • Platform as a Service (Actian Infrastructure, Your Software, Actian Managed)
  • Database as a Service (Actian Infrastructure, Actian Software, Actian Managed)

On Premises

Striking a balance between preservation and innovation with Ingres has led to continuation of our heritage UNIX and VMS platforms as well as support for Linux and Windows. These are available on-premises as virtualized or bare-metal options.

Bring Your Own License

Bring-Your-Own License (BYOL) allows you to simply host and manage your Ingres database in the cloud. Container-based deployment delivers a more portable and resource-efficient way to virtualize the compute infrastructure. Because containers virtualize the operating system rather than the underlying hardware, applications require fewer virtual machines and operating systems to run them. Plus, containers are more lightweight than traditional virtualization schemes and make deployments more predictable, dependable, and repeatable.

Platform as a Service

Customers can use their existing licenses in the Platform as a Service (PaaS) option that completely rearchitects Ingres as a cloud-native solution hosted and managed by Actian. This is a big deal since a cloud-native approach delivers the full advantages of the cloud computing delivery model such as cloud scalability, cloud economics, portability, resource efficiency, improved resiliency and availability, and stronger security in the cloud.

The Ingres cloud-native solution will be a component of the Avalanche™ hybrid-cloud data warehouse, integration, and management platform and will be architected around a combination of containers and microservices that provide functions such as monitoring, alerts, maintenance, patching, metering, authentication (to name just a few). Microservices offer many benefits, but the major one is that, unlike a monolithic architecture, each component is deployed and scaled independently of other services.

Database as a Service

Database as a Service (DBaaS) provides the very same benefits as PaaS for new licenses purchased through Actian.

Safe and Easy Database Modernization

The best thing about BYOL, PaaS, and DBaaS is that you have a choice of Google Cloud, Microsoft Azure, or Amazon AWS without the risk and effort that’s typically associated with moving to the cloud. Actian has already done the heavy lifting. We ensure uniform operation at the SQL and business-logic level across on-premises deployments on UNIX, VMS, Linux, and Windows platforms and the cloud. We’ve fully tested the Ingres BYOL option and provide cloud license portability, ensuring a smooth cloud deployment. Visit the Actian Academy to learn more.

Modernizing Database Applications

As for modernizing your database applications?  That’s where OpenROAD comes in. OpenROAD is a database-centric, object-oriented, 4GL rapid application development (RAD) tool that lets you develop and deploy mission-critical, n-tier business applications on Windows, Linux and Unix connecting to databases such as Ingres, Microsoft SQL Server, Oracle, Actian Zen, Actian X, and more via ODBC.

The Ingres NeXt Initiative provides four options to modernize your application infrastructure using OpenROAD:

ABF and Forms-Based Apps 

OpenROAD migration tools allow you to modernize “green screen” Ingres Applications-By-Forms (ABF) and forms-based applications by converting them into OpenROAD frames, as shown in Figure 3. Modernized applications support cloud and on-premises databases.

OpenROAD Fat Client 

OpenROAD thick-client applications can be transformed to browser-based equivalents without the cost, resource, effort, or risk associated with rewriting or replacing code. Developers can then extend these applications for web and mobile deployment, using HTML5 and JavaScript. Further, OpenROAD supports incremental application migration where modernized applications can run alongside unconverted applications.  

OpenROAD Server 

OpenROAD supports encapsulating and deploying business logic within the OpenROAD Server. Reuse of existing business logic avoids rewriting decades of business logic from scratch. A client, such as an HTML page with JavaScript, can connect to the OpenROAD Server via JSON-RPC with no additional libraries or plugins/add-ons.  

OpenROAD as a Service 

OpenROAD as a Service delivers an Actian-hosted and managed OpenROAD Server. Business logic is exposed as a web service that is available to web-deployed applications.

Take the Next Step

Actian’s Ingres NeXt Initiative is designed to help you modernize and make the most of your existing and future investments in Ingres and OpenROAD. You can choose to modernize Ingres or your database applications. Or you can choose both with efforts that can occur concurrently or sequentially, depending on your needs. Register here to learn more about our Early Access Program.


The post An Easier and Safer Modernization Journey for Ingres and Database Applications appeared first on Actian.


Read More
Author:

Teresa Wingfield