Automate Your Excel Workflow with XlsxWriter and Python

9 min readJun 1, 2023

If you’ve ever had to create a large number of Excel files manually, you know how time-consuming and error-prone the process can be. Fortunately, there’s a better way: by using the Xlsxwriter library in Python, you can easily automate the process of creating and updating Excel files. With xlsxwriter, you can write code that generates complex spreadsheets with formatting, charts, and formulas.

I will be utilizing the powerful combination of Pandas and XlsxWriter. In this article, we’ll explore how you can harness the power of these two Python libraries to streamline your Excel workflow and unleash a new level of data analysis and presentation. Get ready to discover the ultimate duo for creating visually appealing and informative Excel reports that will take your data analysis to the next level.

Documentations

Getting Started

$ pip install xlsxwriter

When using the XlsxWriter engine with Pandas, there is no need to explicitly import the XlsxWriter library. Instead, you can simply specify the engine parameter in the pd.ExcelWriter() function call as engine="xlsxwriter". This tells Pandas to use the XlsxWriter engine to write data to the Excel file.

This line of code creates a new ExcelWriter object from Pandas and assigns it to the variable writer. The ExcelWriter object is used to write data to an Excel file named 'report.xlsx'.

The third parameter, engine_kwargs, is an optional dictionary that specifies additional keyword arguments to pass to the XlsxWriter engine. In this example, we specify an options dictionary with the strings_to_numbers key set to False, which instructs XlsxWriter to treat all data values as strings in the Excel file.

Once the ExcelWriter object writer is created, it can be used to write dataframes to the Excel file using the to_excel() method of Pandas dataframe, which is precisely what we’ll be doing.

I created a Dataframe with some fictional data.

Function to create a dataframe with fictional data

After creating the writer object, and loading the dataframe the next step is to use the to_excel() method to write the data to the Excel file. This method accepts the writer object as the first argument, followed by several optional parameters such as the sheet name, index, and header options. The file name or path is defined in the writerobject. Additionally, the to_excel method offers advanced options like startrow and startcol that determine the starting cell for writing the data. By default, the startrow and startcol parameters are set to zero, which corresponds to the cell A1. These parameters can be useful when appending data to an existing Excel file.

In the following example, I’ve set the start_row parameter to 3 and the start_col parameter to 0, which corresponds to cell A4. The sheet_name parameter is set to 'Sales', while the index and header options are both set to True. Additionally, I've manipulated the dataframe to start its index from 1 instead of 0.

After writing the data, the next thing to do is to save and close the file, this can be done by using the close()method.

The file will be saved in the path specified, if no path is specified, it will be saved in your working dir.

Workbook Formatting

After writing the data to the sheet, you might want to enhance its appearance. With the ExcelWriter library, you can format cells using the format class to make your sheet look more presentable. You can format various cell properties, including fonts, colours, patterns, borders, alignment, and number formatting. Additionally, you can set the width and height of columns and rows. The Format object is used to define cell formatting, which can be created by calling the add_format()method on the workbook.

font_20 = workbook.add_format('font_size': 20)

In order to make the sales report look even better, I’ll be using the format class to add some styling to the report.

I have created quite a number of different format objects ranging from border types to number boldness. As seen below, the properties of the format objects can be set by passing a dictionary of properties to the add_format() constructor.

You can have as many different formats as necessary, just make sure to keep track of them.

Formats used in the project

Here are some of the stylings I applied to the report:

Report Title: To create the report title, I used the title_text and subtitle_text formats. I made the font bold and increased the font size to make it more prominent. I also centred the text alignment to ensure it’s visually appealing. Additionally, I merged certain columns together using the merge_range()method to provide a cohesive title section. Within the text formats, I adjusted the font colour to stand out and catch attention. I chose a blue background colour to enhance the visual appeal and make the title section more attractive, and distinct from the rest of the sheet.

Code to add the report title

2. Profit Column: I added profit information to my data, by carrying out spreadsheet calculations — something every data analyst would want to do using the write_formula() method. PS: the write() method works for formulas too.

write_formula(row, col, formula[, cell_format[, value]])

I calculated the profit for each row of the report table by looping through the length of the report and using string formatting to dynamically reference the row index for calculation. The formula was written to a specific cell using cell notation (e.g., A1) and the ‘=’ sign was included. Additionally, I assigned a cell format, ‘number’ to the calculated profit using the cell_format parameter, which specifies the style format to apply to the cell. The value parameter, on the other hand, was not utilized in this calculation as it is primarily used when working with non-Excel applications.

running profit calculation on all rows in the report

3. Report Total: At the bottom of the report, I included a total row to display the overall cost, sales and profit. To achieve this, I utilized the write_formula() method. To enhance the visibility of the total row, I applied formatting options to make the text and numbers in the cells appear bold. By doing so, the total values stand out and draw attention to the overall performance of the sales report.

4. Borders: To enhance the appearance of the report, I applied custom borders to the table using the conditional_format() method. This method allows you to add formatting to cells based on specific conditions that you define.

By leveraging the conditional_format() method, I was able to add borders to the entire report rather than looping through cells individually. This approach proved to be more efficient and straightforward. The conditional_format() method offers a range of customization options, making it a powerful tool for achieving various formatting requirements.

conditional_format(first_row, first_col, last_row, last_col, options)

The options parameter plays a significant role in the conditional_format() method. It is a dictionary that contains various parameters defining the type and style of the conditional format applied to the cells. Understanding these parameters will enable you to create powerful and customized formatting rules.

One crucial parameter is type, which determines the type of conditional format to apply. The type parameter is required and offers several allowable values, including but not limited to cell, date, text, blanks, no_blanks, and more. Each type corresponds to a specific condition or formatting rule that you can apply to the cells.

Additionally, there are other important parameters such as format, criteria, value, minimum, and maximum. These parameters allow you to further refine the conditional format based on your specific requirements. Exploring the documentation for the conditional_format() method will provide you with a comprehensive understanding of all the available parameters and their functionalities.

I defined two types of border formats: thin_borders and thick_borders, specifying the weight and colour of the borders. To add borders to all cells of the report, including both blank and non-blank cells, I used the blanks and no_blanks options within the type parameter, ensuring that all empty or non-empty cells were included in the formatting. This allowed me to apply the border formats consistently across the entire report.

5. Number and Date Format: I utilized the versatile conditional_format() method to address the formatting of numerical values within the table, specifically the "cost" and "sales prices" columns. Initially, these numbers were displayed without any clear demarcation, making them difficult to read and interpret.

To enhance the readability and facilitate easier comprehension, I incorporated the usage of thousands separator for the numerical values. By introducing this formatting feature, each number was intelligently separated by commas, aiding in visual segmentation and enhancing the overall legibility of the data. This small but impactful adjustment significantly improved the clarity and interpretability of the numerical information within the table.

I also reformatted the date values to make it easier to read, using the conditional format method.

In both cases, I only needed to run the conditional format on non-empty cells.

Adding number and date formats to report

6. Column Widths: Due to the width of the price column data exceeding the default Excel column width, the set_column() method was utilized to expand the width of specific columns accordingly.

Setting the width of columns B - F to 15

7. Hiding Gridlines: An underrated addition I made to the sheet was to hide the gridlines, which draws focus to the sales report table.

writer.sheets['Sales'].hide_gridlines(2)

The final report with all the stylings and formatting looks much better than the initial report we started with.

Creating Multiple Sheets

Creating multiple sheets in a workbook is a common practice among data analysts to organize and separate different operations. With XlsxWriter, creating additional sheets is straightforward. You can utilize the same method used to create the first sheet to create new sheets. In cases where you don’t have a specific dataframe to write to the new sheet, you can create an empty dataframe using pd.DataFrame().

#create sheet NEW using the writer object
pd.DataFrame().to_excel(writer, sheet_name='NEW', index=True, 
                        startrow=0, startcol=0, header=True)

The code snippet above would create a new empty sheet in our ‘report.xlsx’ workbook, and we can go ahead to write on this sheet or add visualizations.

Visualization

In XlsxWriter, you can harness the power of data visualization by creating compelling charts.

To demonstrate this, I utilized pandas and XlsxWriter in tandem to generate a basic column chart that showcases the total cost and sales prices per item. Leveraging pandas’ groupby method with the sum aggregate function, I extracted the necessary values from my dataset. With this data in hand, I then proceeded to create the column chart using XlsxWriter. By specifying the appropriate data range and chart options, I was able to visually represent the aggregated information in an intuitive and informative manner. This column chart not only allows for a quick comparison between cost and sales prices but also enhances the overall understanding and analysis of the data at hand.

#Aggregate data by items and calculate total cost and sales prices
items_cost_sales_df = report_data.groupby('Items')[['Cost Price', 
                                                     'Sale Price']].sum()

The visualization is created on a separate sheet ‘Chart’ in the workbook. The column chart shows the total cost of each item versus the sales price.

Code to create a column chart

Conclusion

With the completion of all the steps above, a robust report template is now being developed. The template can be further automated to generate monthly or yearly reports effortlessly, eliminating the need to repeat the analysis process from scratch. By utilizing the power of programming and data manipulation libraries like Pandas and XlsxWriter, report generation can be streamlined and made more efficient. This automation allows for quick and accurate generation of reports, saving valuable time and effort for data analysts. The template serves as a foundation for generating consistent and comprehensive reports, enabling businesses to gain valuable insights and make data-driven decisions.

Xlsxwriter can be used for much more complex reporting, explore the documentation to unlock its potential.

The code used for this article can be found in this GitHub repository.

Drop some claps and follow the account if you found this useful.

Thank you!

Automate Your Excel Workflow with XlsxWriter and Python

Getting Started

Workbook Formatting

Creating Multiple Sheets

Visualization

Conclusion

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Temitope Bimbo Babatola

No responses yet