The Audit Analytics

A common analysis is to determine whether all purchase invoices have been correctly and completely recorded in the general ledger. Why? Well, you don’t want to miss ghost invoices, detect duplicate bookings, and simply ensure that what has been received has also been properly recorded.

Supplier Analysis

What data do we need?

To check this, we need two datasets:

Purchase invoices – This is the list of all received purchase invoices, including invoice numbers, suppliers, invoice amounts, and invoice dates. Download the dummy data here.
Bookings in the financial administration – This is the list of all booked purchase invoices, with amounts and invoice numbers as recorded in the administration. Download the dummy data here.

Now, the question is: how do we match these lists and find discrepancies?

Step-by-step analysis with Python

Let's use Python to perform this check. We will look at:

Missing invoices
Duplicate bookings
Differences in amounts

Step 1: Load dummy data

First, we load the dummy datasets.

import pandas as pd

# Read the audit file (bookings) and invoice file with date parsing
bookings = pd.read_csv("dummy_auditfile_accounts_payable.csv", parse_dates=["Date"])
invoices = pd.read_csv("dummy_invoices_analysis.csv", parse_dates=["InvoiceDate"])

# We filter only the purchase bookings, as we know they are recorded under account 2000.
purchase_bookings = bookings[bookings['AccountNumber'] == 2000]

# Merge the files based on InvoiceNumber
# Using 'outer' join to include all records from both files, including mismatches.
df = purchase_bookings.merge(invoices, on="InvoiceNumber", how="outer", suffixes=("_invoice", "_auditfile"))

# Display the first few rows of the merged dataset to verify the merge was successful
print(df.head())

Analysis 1: Identify missing invoices

Often, we reason from the audit file; do we see the bookings also reflected in the invoices?

# Identify bookings without a corresponding invoice
bookings_without_invoice = df[df['TotalAmount'].isna()].reset_index()
print(bookings_without_invoice)

In some cases, it is also useful to check purchase invoices that were not booked.

# Identify invoices missing from bookings
missing = invoices[~invoices['InvoiceNumber'].isin(bookings['InvoiceNumber'])]
print(missing)

If you are sharp, you will notice that in this last check, we do not use the merged dataframe at all but work directly with the source tables. I wanted to show you this as well. This approach is only useful for checking whether certain values exist. For comparisons (as in Analysis 3), it is much better to use a merged table.

Analysis 2: Detect duplicate bookings

Sometimes invoices are accidentally booked twice. We check if an invoice number appears more than once in the bookings.

duplicate_bookings = bookings[bookings.duplicated('InvoiceNumber', keep=False)]
print(duplicate_bookings)

Analysis 3: Detect amount discrepancies

Now, we check whether the booked amounts match the original invoices.

# Identify discrepancies between invoice amounts and booked amounts
# This also accounts for negative amounts (creditor invoices)
discrepancies = df[(df['Amount']*-1 != df['TotalAmount']) & (~df['TotalAmount'].isna())]
print(discrepancies)

In this dummy set, we have included quite a few mismatches. To calculate the total discrepancy, you can use the following:

total_discrepancy = discrepancies.groupby("InvoiceNumber").apply(lambda x: (x["Amount"] - x["TotalAmount"])).sum()
print(total_discrepancy)

Conclusion

These kinds of checks are incredibly helpful for detecting errors, and implementing them wasn’t too difficult, right? Download the dummy set and try it out yourself!

Invoices in the Administration