How to Batch Convert Credit Card Statement PDF to Excel

Note: due to pdfplumber imperfection in detecting blank cells. You do need to cut and align the date and value columns with the text labels.

Prerequisites

Before proceeding, ensure you have the following installed on your system:

Python (latest version)
pdfplumber Python library
PowerShell or Command Prompt

Step 1: Install Python

Open PowerShell or Command Prompt.
Download and install Python:
- Visit Python’s official website and download the latest version.
- Run the installer and ensure you select “Add Python to PATH” before clicking Install Now.
Verify installation by running:
```
python --version
```

Step 2: Install `pdfplumber`

Open PowerShell or Command Prompt.
Install pdfplumber using pip:
```
pip install pdfplumber pandas openpyxl
```

Step 3: Prepare PDF Files

Place the PDF files you want to convert in the Downloads folder.
- Example path: C:\Users\YourUsername\Downloads\A

Step 4: Run the Script to Convert PDF to Excel

Open Notepad and paste the following Python script: (Remember update folder path & password if any) ```import pdfplumber import pandas as pd import os

# Folder containing PDFs & file password if any pdf_folder = r”C:\Users\YourUsername\Downloads\A” password = “xxxx”

# Get all PDF files in the folder pdf_files = [f for f in os.listdir(pdf_folder) if f.lower().endswith(“.pdf”)]

for pdf_file in pdf_files: pdf_path = os.path.join(pdf_folder, pdf_file) output_file = os.path.join(pdf_folder, pdf_file.replace(“.pdf”, “.xlsx”))

   data = []
   max_columns = 0  # Track max columns for consistency
  
   try:
       with pdfplumber.open(pdf_path, password=password) as pdf:
           print(f"✅ Opened PDF: {pdf_path} ({len(pdf.pages)} pages detected)")
  
           for page_number, page in enumerate(pdf.pages, start=1):
               print(f"🔹 Processing Page {page_number}")
  
               tables = page.extract_tables()
               if not tables:
                   print(f"⚠️ No structured table found on Page {page_number}, trying text extraction...")
                   page_text = page.extract_text()
                   if page_text:
                       tables = [page_text.split("\n")]  # Treat raw text lines as table rows
  
               if tables:
                   print(f"✅ {len(tables)} table(s) found on Page {page_number}")
                   for table in tables:
                       for row in table:
                           if row is None:
                               empty_row = [""] * max_columns if max_columns > 0 else [""]
                               data.append(empty_row)
                               continue
  
                           while len(row) < max_columns:
                               row.append("")
  
                           split_rows = []
                           row_max_lines = 1
  
                           for cell in row:
                               if cell:
                                   lines = cell.strip().split("\n")
                                   split_rows.append(lines)
                                   row_max_lines = max(row_max_lines, len(lines))
                               else:
                                   split_rows.append([""])
  
                           for i in range(row_max_lines):
                               new_row = [col[i] if i < len(col) else "" for col in split_rows]
                               data.append(new_row)
  
                           max_columns = max(max_columns, len(row))
  
   except Exception as e:
       print(f"❌ Failed to process {pdf_file}: {e}")
       continue
  
   for i in range(len(data)):
       while len(data[i]) < max_columns:
           data[i].append("")
  
   df = pd.DataFrame(data)
   df.columns = [f"Column {i+1}" for i in range(max_columns)]
   df.to_excel(output_file, index=False)
   print(f"✅ Conversion done! Excel file saved at: {output_file}") ```

Save the file as convert_pdf_to_excel.py in the Downloads\A folder.
Open PowerShell or Command Prompt and navigate to Downloads\A:
```
cd %USERPROFILE%\Downloads\A
```
Run the script:
```
python convert_pdf_to_excel.py
```

Step 5: Check the Output

The converted Excel files will be saved in the Downloads folder.
Open the .xlsx file using Excel to verify the extracted data.

Troubleshooting

If Python is not recognized, restart your computer or reinstall Python ensuring “Add Python to PATH” is selected.

If pdfplumber is missing, try reinstalling it with:

pip install --upgrade pdfplumber pandas openpyxl

If the script doesn’t work for your PDFs, some documents may have embedded images instead of selectable text.

By following these steps, you can efficiently extract tables from PDFs into Excel using PowerShell or Command Prompt with Python.

Step 6: Combine all excels into a single excel file

Open Notepad and paste the following Python script:

import pandas as pd
import os
   
input_folder = os.path.expanduser("~/Downloads/A")
output_file = os.path.join(input_folder, "combined.xlsx")
   
all_data = []
for file in os.listdir(input_folder):
    if file.endswith(".xlsx") and file != "combined.xlsx":
        file_path = os.path.join(input_folder, file)
        df = pd.read_excel(file_path, sheet_name=0)
        all_data.append(df)
   
combined_df = pd.concat(all_data, ignore_index=True)
combined_df.to_excel(output_file, index=False)
print(f"All Excel files combined into {output_file} successfully!")

Save the file as combine_excels.py in the Downloads folder.
Open PowerShell or Command Prompt and navigate to Downloads:
```
cd %USERPROFILE%\Downloads\A
```
Run the script:
```
python combine_excels.py
```

Give it a try and simplify your expense tracking! 🚀

Support Me 💖

Share on

Twitter Facebook LinkedIn

CreditCard2Excel – Convert Credit Card Statements to Excel

Matthew Choo

How to Batch Convert Credit Card Statement PDF to Excel

Prerequisites

Step 1: Install Python

Step 2: Install `pdfplumber`

Step 3: Prepare PDF Files

Step 4: Run the Script to Convert PDF to Excel

Step 5: Check the Output

Troubleshooting

Step 6: Combine all excels into a single excel file

Support Me 💖

Share on

You may also enjoy

How Exactly Xiaomi Interconnectivity Home Screen+ Works?

Make Image Background Transparent - EXE App

Make Image Background Transparent Without Any App

Chinese Text Extraction from Images Using Python and Tesseract OCR

CreditCard2Excel – Convert Credit Card Statements to Excel

Matthew Choo

How to Batch Convert Credit Card Statement PDF to Excel

Prerequisites

Step 1: Install Python

Step 2: Install pdfplumber

Step 3: Prepare PDF Files

Step 4: Run the Script to Convert PDF to Excel

Step 5: Check the Output

Troubleshooting

Step 6: Combine all excels into a single excel file

Support Me 💖

Share on

You may also enjoy

How Exactly Xiaomi Interconnectivity Home Screen+ Works?

Make Image Background Transparent - EXE App

Make Image Background Transparent Without Any App

Chinese Text Extraction from Images Using Python and Tesseract OCR

Step 2: Install `pdfplumber`