Data ingestion explained

Audit firms rely on data to make accurate business decisions, predict trends, forecast the market, plan for future needs, and understand their customers. The data might come in different formats and from various sources, and auditors are generally not experts in these formats or source systems. Because the data comes from different places, it needs to be cleansed and transformed so that you can analyse it together with data from other sources. Otherwise, your data is like a bunch of puzzle pieces that don’t fit together.

Put simply, data ingestion is the process of preparing data for analysis, so you can see the big picture hidden in your data.

Automated data ingestion works by integrating data from disparate sources, such as Enterprise Resource Planning (ERP) systems and HR systems. The integrated data is then transferred to a secure location the auditor can access, where it is stored and analysed. The data can be stored in a database, data warehouse, document store, data mart, or spreadsheets. This storage can sit in your own environment or in a data extraction provider’s environment through a SaaS (Software as a Service) data platform.

Data-powered audit processes

Data is the fuel that powers many of an organisation’s mission-critical processes, from business intelligence (BI) and predictive analytics to data science and machine learning (ML). However, to be of any use to an organisation, the data needs to be plentiful, readily available, and clean.

The process of data ingestion usually includes three steps known as ETL: extract (taking the data from its current location), transform (cleansing and normalising the data), and load (placing the data in a database ready for analysis). Organisations typically find extracting and loading data fairly easy. However, many run into challenges with the transform step. As a result, data sits idle and is of little use for analysis, because it isn’t in a suitable format to process.
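To make the three ETL steps concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a description of any particular product: the invoice columns, the sample rows, and the function names are all hypothetical. It reads a small CSV export, normalises amounts and dates, and loads the clean rows into an in-memory SQLite table.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical raw export: stray spaces, mixed date formats, a missing amount.
RAW_CSV = """invoice_id,amount,invoice_date
 INV-001 ,"1,200.50",31/01/2024
INV-002,300,2024-02-15
INV-003,,2024-03-01
"""

def extract(text):
    """Extract: read rows from the source (here, an in-memory CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))

def parse_date(value):
    """Normalise the two date formats assumed in this example to ISO 8601."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {value!r}")

def transform(rows):
    """Transform: cleanse and normalise; set aside rows that cannot be used."""
    clean, rejected = [], []
    for row in rows:
        amount = row["amount"].replace(",", "").strip()
        if not amount:
            rejected.append(row)  # keep for follow-up rather than guessing a value
            continue
        clean.append((
            row["invoice_id"].strip(),
            float(amount),
            parse_date(row["invoice_date"].strip()),
        ))
    return clean, rejected

def load(rows, conn):
    """Load: place the cleaned rows in a database table ready for analysis."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices "
        "(invoice_id TEXT PRIMARY KEY, amount REAL, invoice_date TEXT)"
    )
    conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract(RAW_CSV))
load(clean_rows, conn)
result = conn.execute("SELECT * FROM invoices ORDER BY invoice_id").fetchall()
```

Most of the code sits in the transform step, which reflects the point above: extracting and loading are short, while cleansing is where the effort goes. The row with the missing amount is set aside in `rejected_rows` instead of being silently dropped.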

So, how is data ingested?

Data can be ingested in real time, in batches, or in a combination of the two (called lambda architecture). Batched data is imported at regularly scheduled intervals, which is good when you need data at specific time points. Real-time ingestion is useful when the information gleaned is very time-sensitive, but it can be very resource-intensive. Lambda architecture approaches attempt to balance the benefits of batch and real-time modes by using batch processing to provide comprehensive views of batch data, while also using real-time processing to provide views of time-sensitive data.
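The way the batch and real-time paths combine can be sketched in a few lines of Python. This is a toy, in-memory illustration of the lambda idea under simplifying assumptions; the class name `LambdaTotals` and the account-and-amount events are hypothetical. A batch run recomputes totals from the full history, a real-time view tracks only what has arrived since the last batch, and a query merges the two.

```python
from collections import defaultdict

class LambdaTotals:
    """Toy lambda-style store: batch view plus real-time view of running totals."""

    def __init__(self):
        self.master_log = []                   # full history of ingested events
        self.batch_view = {}                   # totals as of the last batch run
        self.speed_view = defaultdict(float)   # totals for events since that run

    def ingest(self, account, amount):
        """Real-time path: record the event and update the speed view at once."""
        self.master_log.append((account, amount))
        self.speed_view[account] += amount

    def run_batch(self):
        """Batch path: recompute totals from the full history, then reset the speed view."""
        totals = defaultdict(float)
        for account, amount in self.master_log:
            totals[account] += amount
        self.batch_view = dict(totals)
        self.speed_view.clear()

    def query(self, account):
        """Serving step: merge the comprehensive batch view with recent events."""
        return self.batch_view.get(account, 0.0) + self.speed_view.get(account, 0.0)

store = LambdaTotals()
store.ingest("sales", 100.0)
store.ingest("sales", 50.0)
store.run_batch()             # scheduled batch run covers the first two events
store.ingest("sales", 25.0)   # arrives after the batch run
total = store.query("sales")  # batch total plus the recent event
```

In a real deployment the batch run would be scheduled and the two views would live in separate systems, which is where the extra resource cost of real-time processing comes from.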

Engine B’s data ingestion tool

Due to the ever-expanding volume of corporate data and the lack of reliable, standardised data, audit firms are currently unable to perform analytics at scale. This means expensive and highly trained audit staff spend days tidying and reconciling evidence that could be checked in seconds if the data were properly formatted. As a result, audits are more expensive and audit teams have less time to focus on risk and quality. And these challenges are only getting worse over time.

Engine B’s data ingestion product, EB Integration Engine, is solving these data challenges for auditors by automating the extraction process and mapping data to a common format. This enables scalable analysis and gives organisations more control over, and consistency in, their data extraction tools.

EB Integration Engine ingests data and integrates it into the Audit Common Data Model (CDM), which prepares it for integration with all the firm’s technology. Alternatively, the data can be analysed in Microsoft Power BI or any client analytics tool.
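The general idea of mapping records from different systems to one common format can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the source names `erp_a` and `erp_b`, the field names, and the `to_common` helper are hypothetical, and they do not represent the actual Audit Common Data Model schema or how EB Integration Engine works internally.

```python
# Hypothetical per-source field mappings: source field name -> common field name.
FIELD_MAPS = {
    "erp_a": {"DocNo": "entry_id", "PostDate": "posting_date", "Amt": "amount"},
    "erp_b": {"journal_ref": "entry_id", "date_posted": "posting_date", "value": "amount"},
}

def to_common(record, source):
    """Rename a source record's fields to the common names and tag its origin."""
    mapping = FIELD_MAPS[source]
    missing = [field for field in mapping if field not in record]
    if missing:
        raise KeyError(f"{source} record is missing fields: {missing}")
    common = {target: record[field] for field, target in mapping.items()}
    common["source_system"] = source
    return common

# Two records that describe the same kind of entry in different layouts.
a = to_common({"DocNo": "J1", "PostDate": "2024-01-31", "Amt": 10.0}, "erp_a")
b = to_common({"journal_ref": "J9", "date_posted": "2024-02-01", "value": 20.0}, "erp_b")
```

Once both records share the same field names, a single analysis can run over data from either system without source-specific logic.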

Why is our data ingestion tool unique?

EB Integration Engine reads the entire ERP system, from accounts receivable to fixed assets to employee listings, and brings in every data point. It also brings in unstructured data, such as invoices and contracts, and external data from company registries or financial markets.

Want to find out more? Download our guide to the EB Integration Engine for a more detailed look at the tool in action.

