
Pandas Combine Excel Spreadsheets

I have an Excel workbook with many tabs. Each tab has the same set of headers as all others. I want to combine all of the data from each tab into one data frame (without repeating

Solution 1:

This is one way to do it -- load all sheets into a dictionary of dataframes and then concatenate all the values in the dictionary into one dataframe.

import pandas as pd

Set sheet_name to None to load all sheets into a dict of dataframes keyed by sheet name, and set index_col to None so that no data column is used as the index (see comment by @bunji)

df = pd.read_excel('tmp.xlsx', sheet_name=None, index_col=None)

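Then concatenate all dataframes, passing ignore_index=True so the row labels from the individual sheets do not overlap

cdf = pd.concat(df.values(), ignore_index=True)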

print(cdf)
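If you also want to know which tab each row came from, here is a minimal sketch of one way to do it. It relies on the fact that pd.concat accepts the dict itself and uses the sheet names as the outer index level. The helper name combine_sheets and the column name "sheet" are my own choices for illustration, not part of the answer above, and the sketch assumes none of your sheets already has a column called "sheet". The demo uses small in-memory frames in place of a real workbook.

```python
import pandas as pd


def combine_sheets(sheets):
    """Stack a dict of DataFrames, such as the one returned by
    pd.read_excel(..., sheet_name=None), into a single DataFrame."""
    # Passing the dict makes the sheet names the outer level of a MultiIndex.
    combined = pd.concat(sheets, names=["sheet", "row"])
    # Move the sheet name into an ordinary column, then renumber rows 0..n-1.
    return combined.reset_index(level="sheet").reset_index(drop=True)


# Stand-in for: sheets = pd.read_excel('tmp.xlsx', sheet_name=None)
sheets = {
    "Jan": pd.DataFrame({"id": [1, 2], "amount": [10.0, 20.0]}),
    "Feb": pd.DataFrame({"id": [3], "amount": [30.0]}),
}
combined = combine_sheets(sheets)
print(combined)
```

The headers appear once, as column names, and every row keeps a record of its source tab.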

Solution 2:

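import pandas as pd

f = 'file.xlsx'
# sheet_name=None loads every sheet into a dict of dataframes keyed by sheet name
# (read_excel has no ignore_index parameter; that option belongs to concat)
df = pd.read_excel(f, sheet_name=None)
# Stack all sheets; sort=True sorts the columns if the sheets are not already aligned
df2 = pd.concat(df, sort=True, ignore_index=True)

# engine='xlsxwriter' requires the XlsxWriter package to be installed
df2.to_excel('merged.xlsx',
             engine='xlsxwriter',
             sheet_name='Merged',
             header=True,
             index=False)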
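One thing to watch for: concat takes the union of the columns, so a tab with a misspelled or extra header will not raise an error, it will just produce columns filled with NaN. Since the question says every tab should have the same headers, a quick check before merging may be worthwhile. The following is only a sketch; the helper name sheets_with_different_headers is hypothetical, and it compares column names without regard to their order.

```python
import pandas as pd


def sheets_with_different_headers(sheets):
    """Return the names of sheets whose column names differ from the first sheet's."""
    names = list(sheets)
    if not names:
        return []
    expected = set(sheets[names[0]].columns)
    # Compare as sets, because concat aligns by column name, not by position.
    return [name for name in names[1:] if set(sheets[name].columns) != expected]


# Stand-in for: sheets = pd.read_excel('file.xlsx', sheet_name=None)
sheets = {
    "A": pd.DataFrame({"id": [1], "amount": [10.0]}),
    "B": pd.DataFrame({"amount": [20.0], "id": [2]}),
    "C": pd.DataFrame({"id": [3], "amout": [30.0]}),  # misspelled header
}
mismatched = sheets_with_different_headers(sheets)
print(mismatched)
```

If the returned list is empty, the merged frame should have exactly the columns of the first tab.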
