Get Comfortable Wrapping Yourself Around Python

As more tools become necessary for analyzing large and unstructured data sets in today’s digital environment, the open-source programming language Python has emerged as an important contributor to audits, internal audit, IT audit, business planning and forecasting, as well as traditional accounting roles. This feature explains why Python has gained popularity among financial professionals and discusses its relevance to CPAs.


by J. L. “John” Alarcon, CPA, CGMA, CITP, Kevin C. Moffitt, PhD, and Cory Ng, CPA, DBA, CGMA Jun 8, 2022, 18:13 PM



What Is Python?

First released in 1991, Python is a freely available programming language commonly used by data scientists for data collection, analysis, and visualization. Python is particularly useful in big-data environments with large amounts of structured data (such as data from relational databases) and unstructured data (such as textual data from various sources). Another popular language among data scientists is R (also released in the 1990s), an open-source programming language mainly used for statistical analysis.

Considered easy to use for beginners, Python also is a general-purpose programming language: it can be used for many other purposes beyond data analysis, such as automating tasks, designing a new website, or building software applications. 

Standard data analytics software applications, such as Alteryx, Power BI, SAS, and Tableau, can now be used in conjunction with integrated Python or R libraries of algorithms and functions. This enables users to combine the capabilities of the technologies to perform custom or advanced tasks in areas such as data extraction, text mining, and machine learning.

Why CPAs Should Care about Python

Already popular among data analysts and developers in many industries, Python has gained considerable momentum in the world of financial services and financial technologies (Fintech). As part of this trend, it has started to emerge as a helpful resource for accountants. 

The AICPA recently introduced online resources to help CPAs incorporate Python into financial statement audits. Some resources1 include a video, “Upgrade the Financial Statement Audit with Python,” and a white paper on how to use Python. The white paper explains that, in the context of a financial statement audit, Python can help with extracting, formatting, loading, testing, and analyzing data, as well as with visualizing and documenting results.

In 2020, CPA Canada published a report suggesting that CPAs should learn how to code for automation, data analytics, and visualization and become comfortable with programming languages such as Visual Basic for Applications (VBA), Python, and Structured Query Language (SQL).2 The report pointed out that the ability to code is particularly crucial for combining, filtering, and preparing data from different sources and automating labor-intensive processes, which is especially true when a CPA’s job function requires extracting insight from large amounts of data.

A November 2020 survey of 992 members of the Association of Chartered Certified Accountants (ACCA) found that 57% of respondents had no knowledge of coding, but 40% expressed an interest in learning. Looking three years ahead, none of the respondents wanted to remain without any knowledge of coding.3 The report stated:
“Of particular interest are fourth-generation programming languages like Python where the code is expressed in a very intuitive way similar almost to writing a sentence in natural language. These types of languages can lend themselves to use cases like data analysis, data visualization, and scripts for customized reports to reduce low-value repetitive tasks, which are relevant for accountancy and finance professionals.”

The report concluded that coding can be a valuable skillset, suggesting that professional accountants should consider acquiring at least a basic understanding of coding to add value to their organizations and to prepare for new career opportunities moving forward. 

Professionals and educators disagree about how important it is for accountants to learn programming skills, as other surveys indicate.4 However, universities and professional education organizations are expected to significantly increase technology and data analytics instruction, including artificial intelligence and machine learning. It has become clear that CPAs will have to learn more technologies than they did before to add value to their organizations in the digital era.

Python from an Accountant’s Perspective

In the context of corporate finance and accounting, Python would be particularly relevant in job functions requiring data analytics and process automation, such as in audits, internal audit, IT audit, or business planning and forecasting, but it may also be useful in traditional areas by automating highly manual and time-consuming tasks. It can be used as a standalone tool or in conjunction with standard software applications such as Excel, data analytics software (e.g., Alteryx, Power BI, SAS, Tableau), audit software (e.g., Galvanize – now part of Diligent – and CaseWare IDEA), or robotic process automation (RPA) software (e.g., Automation Anywhere, Blue Prism, UiPath). When used in combination with a standard software application, Python typically performs custom or advanced data analysis or processing tasks by leveraging its extensive open-source libraries of algorithms and functions, which are particularly useful for mining unstructured data (such as text mining), including using AI and machine learning functions.

Besides benefitting from the increased automation enabled by Python in financial applications, CPAs may find significant benefits in Python’s data extraction, data preparation, and data analysis (including visualization) facilities. 

Data acquisition – Whether interfacing with an enterprise resource planning (ERP) system, an application programming interface (API), a database, or the web, a basic to intermediate understanding of Python can greatly facilitate data acquisition tasks. For example, in the case of the SAP ERP system, after installing the necessary SAP module and Python connector, an advanced user can establish a connection with the SAP system, after which data from tables can be extracted for further data manipulation and analysis in Python.5 APIs allow for the establishment of communication between two different applications for data gathering and application development. For example, Yahoo Finance and SEC EDGAR are searchable via yfinance and sec-api, APIs developed for Python programmers.6 These APIs allow for real-time tracking and integration of stock market and financial filing data. Myriad APIs exist that access other websites and organizational data (some with premium access), including Wikipedia, Twitter, and Bloomberg. If a programmer has direct access to a database, then establishing a connection through the Open Database Connectivity (ODBC) API will allow the programmer to execute SQL statements. SQL statements can be used to insert, update, and extract data from database management systems, among other things. The internet contains vast stores of unstructured data in the forms of audio, video, and text. Scraping data from the web using Python is facilitated by libraries such as BeautifulSoup and Scrapy.
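The insert, update, and extract pattern described above can be sketched in a few lines. This is a minimal illustration, not a production connection: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for an ODBC connection to a real system, and the invoices table and its rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database: an assumed stand-in for an ODBC connection
# to an organization's database management system.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table created only for this illustration.
cur.execute(
    "CREATE TABLE invoices ("
    "invoice_id INTEGER PRIMARY KEY, vendor TEXT, amount REAL, approved INTEGER)"
)

# INSERT: load a few made-up rows using a parameterized statement.
rows = [
    (1001, "Acme Supply", 1250.00, 1),
    (1002, "Birch Consulting", 9800.00, 0),
    (1003, "Acme Supply", 430.50, 1),
]
cur.executemany("INSERT INTO invoices VALUES (?, ?, ?, ?)", rows)

# UPDATE: mark one invoice as approved.
cur.execute("UPDATE invoices SET approved = 1 WHERE invoice_id = ?", (1002,))
conn.commit()

# SELECT: extract approved totals by vendor for further analysis in Python.
cur.execute(
    "SELECT vendor, SUM(amount) FROM invoices "
    "WHERE approved = 1 GROUP BY vendor ORDER BY vendor"
)
vendor_totals = cur.fetchall()
print(vendor_totals)

conn.close()
```

Against a real system, the connection line would typically be replaced by a call to a database driver, while the SQL statements and the cursor-based workflow would look much the same.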

Data preparation and analysis – Data is rarely ready for consumption or analysis without passing through a data preparation stage. In this stage, data may be cleaned, transformed, standardized, or normalized. These tasks become increasingly complex as the number and variety of data sources being merged increase. Manipulating data with a programming language provides the most flexibility in data preparation, and Python libraries such as NumPy, Pandas, and scikit-learn are powerful resources.
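As a small illustration of the cleaning and standardizing step, the sketch below uses only Python's standard library. The records, field names, and helper functions (clean_vendor, clean_amount, clean_date) are hypothetical and chosen to mimic data arriving from two source systems with inconsistent formatting.

```python
from datetime import datetime

# Made-up raw records with inconsistent vendor names, amounts, and dates.
raw_records = [
    {"vendor": "  acme supply ", "amount": "1,250.00", "date": "2022-01-15"},
    {"vendor": "ACME SUPPLY", "amount": "$430.50", "date": "01/20/2022"},
    {"vendor": "Birch Consulting", "amount": "9800", "date": "2022-02-03"},
]

def clean_vendor(name):
    """Collapse extra whitespace and standardize capitalization."""
    return " ".join(name.split()).title()

def clean_amount(text):
    """Strip currency symbols and thousands separators, then convert to float."""
    return float(text.replace("$", "").replace(",", ""))

def clean_date(text):
    """Accept YYYY-MM-DD or MM/DD/YYYY and return an ISO-format date string."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")

# Apply the three cleaning steps to every record.
cleaned = [
    {
        "vendor": clean_vendor(r["vendor"]),
        "amount": clean_amount(r["amount"]),
        "date": clean_date(r["date"]),
    }
    for r in raw_records
]

for row in cleaned:
    print(row)
```

After this step, the first two records share the same vendor spelling and date format, so they can be grouped or merged reliably.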

NumPy is a well-established Python library that undergirds some of the most popular and useful science, math, machine-learning, and visualization Python libraries, which leverage NumPy’s multidimensional arrays. Pandas is a Python library that is extremely adept at loading, manipulating, and exporting data using dataframes – two-dimensional arrays with columns and rows and an explicitly defined index, or primary key-type column. Dataframes can easily be created from Excel or JSON files, then merged together, sliced, and exported to a desired file type.
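The merge, slice, and export workflow might look like the following sketch. It assumes the Pandas library is installed; the two dataframes are built from made-up values rather than read from files, and the column names are invented for the example.

```python
import pandas as pd

# Made-up data; in practice these might be loaded with pd.read_excel()
# or pd.read_json() instead of being typed in.
invoices = pd.DataFrame({
    "invoice_id": [1001, 1002, 1003],
    "vendor_id": ["V01", "V02", "V01"],
    "amount": [1250.00, 9800.00, 430.50],
})
vendors = pd.DataFrame({
    "vendor_id": ["V01", "V02"],
    "vendor_name": ["Acme Supply", "Birch Consulting"],
})

# Merge: attach vendor names to each invoice using the shared key column.
merged = invoices.merge(vendors, on="vendor_id", how="left")

# Slice: keep only invoices above an illustrative 1,000 threshold.
large = merged[merged["amount"] > 1000]

# Summarize: total invoice amount per vendor.
totals = merged.groupby("vendor_name")["amount"].sum()

# Export: to_csv() returns the CSV text when no file path is given.
csv_text = large.to_csv(index=False)
print(csv_text)
print(totals)
```

Passing a file name to to_csv() would write the result to disk instead; writing to Excel with to_excel() works similarly but requires an additional Excel engine package to be installed.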

There are numerous Python libraries for visualizing data, and the usefulness of each depends on the task at hand. Matplotlib uses the NumPy library and implements many of the same visualization features as MATLAB,7 and it is the basis for other visualization libraries, including Seaborn, a library that specializes in the depiction of statistical analyses. Other libraries specialize in interactive visualizations, web design, and touch-screen interfaces.

Machine learning and AI are areas that Python addresses especially well. PyTorch is a Facebook-developed, state-of-the-art open-source framework and software library to facilitate building machine-learning applications. It is geared toward moving research and prototyping to the production environment. Tesla, Uber, and Microsoft rely on PyTorch implementations. TensorFlow is roughly the Google equivalent of PyTorch, and it too can be accessed through Python.

What should be of particular interest to CPAs and auditors is how Python interacts with the software they already use. Power BI Desktop can run Python scripts directly with the installation of Python, Pandas, and Matplotlib. CaseWare IDEA and Galvanize (now part of Diligent) similarly support Python scripting for data transformation and analysis. Alteryx has a built-in Python tool that mimics the Jupyter Notebook interface and includes many of the libraries described above.8 Tableau also can connect directly to Python. In all of these cases, the flexibility and power of Python can drive current application use to more insightful levels.

Future of Coding

Along with the growing popularity of Python, trends in application development include the proliferation of no-code/low-code (NCLC) platforms and the increasing use of AI and machine learning to automate coding. NCLC tools allow “citizen-developers” without a programming background to build applications using a drag and drop interface. Thus, anyone within an organization could potentially create custom applications much faster and at a lower cost compared with traditional programming methods.

CPAs are already using NCLC tools. Creating a macro in Excel and using certain advanced functions of data analytics software applications such as Power BI or Alteryx are examples of how accountants and finance professionals use NCLC tools today. Alteryx, for example, provides advanced tools for data wrangling, creating visualizations, and building predictive models without having to write code. Robotic process automation is another example of an application that provides NCLC technology to accountants and finance professionals. As discussed earlier, a number of these software applications offer varying levels of integration with Python, enabling users to extend the capabilities of the NCLC tools. When combined with a bit of knowledge of Python, these tools can put a lot of power in the hands of business users.

Increasingly, AI and machine-learning technologies are being used to automatically generate code and accelerate the application development process. For example, Microsoft, the collaborative coding platform GitHub, and AI research and development company OpenAI recently launched an application called Copilot. The application analyzes computer code written by humans and then uses machine learning to generate new code to complete a program.9 Copilot supports multiple languages, such as Java, C, C++, C#, Python, and JavaScript. According to a study by the University of Cambridge, programmers spend about half of their programming time debugging code.10 By leveraging the power of AI, programmers can write code more efficiently and potentially with fewer bugs.

Where to Learn More about Python 

CPAs looking to learn more about Python have several options. Training programs range from no-cost online programs to traditional computer science courses offered at colleges and universities. Freecodecamp.org offers 12 certification tracks of 300 hours each in computer programming at no cost. Three of the certifications focus on Python: Scientific Computing with Python, Data Analysis with Python, and a more advanced Machine Learning with Python. To earn each certification, learners must complete five projects that demonstrate mastery of the material. There are other low-cost options that can supplement Python learning, such as Coursera, Udemy, datacamp.com, and codecademy.com. Many colleges and universities with computer science departments may also offer courses in Python programming that can be taken as stand-alone courses for continuing learners or as part of a degree program. 

In today’s digital era, the ability to derive insight from vast amounts of data to enhance decision-making is vital. CPAs with basic to intermediate Python programming skills could add significant value to their organizations by building customized applications for advanced data analytics in a variety of accounting and financial management functions. Python’s open-source platform and versatility make it a particularly relevant language for CPAs to learn, or at least to understand at a fundamental level.

1 AICPA’s Audit Data Standards website, www.aicpa.org/interestareas/frc/assuranceadvisoryservices/auditdatastandards.html, and “Audit Data Standard and Audit Data Analytics Working Group,” AICPA (March 2019). www.aicpa.org/content/dam/aicpa/interestareas/frc/assuranceadvisoryservices/downloadabledocuments/ads-instructional-paper-python.pdf
2 “Why Should CPAs Code?” CPA Canada (January 2020). www.cpacanada.ca/-/media/site/operational/rg-research-guidance-and-support/docs/02355-rg-why-should-cpas-code-jan-2020.pdf
3 “Coding: As a Professional Accountant, Why You Should be Interested,” Association of Chartered Certified Accountants (ACCA) (July 2021). www.accaglobal.com/gb/en/technical-activities/technical-resources-search/2021/july/coding-as-a-prof-accountant.html
4 A 2018 survey among accounting faculty ranked programming languages least important behind Excel, data analytics, and statistical and database tools – Ann C. Dzuranin, Janet R. Jones, and Renee M. Olvera, “Infusing data analytics into the accounting curriculum: A framework and insights from faculty,” Journal of Accounting Education, 43, pgs. 24-39 (2018). A 2019 study among accounting professionals ranked programming (such as using R, Java, Python) as the least important analytical skill in the future behind advanced Excel functions, visualization, computer-assisted audit tools, database, infrastructure, and statistics tools – William D. Brink and M. Dale Stoel, “Chapter 2: Analytical knowledge, skills, and abilities for accounting graduates,” Advances in Accounting Education: Teaching and Curriculum Innovations, Vol. 22, pgs. 23-43 (2019). 
5 Arkesh Sharma, “Connecting Python with SAP (step-by-step guide),” SAP Community blog (June 9, 2020). https://blogs.sap.com/2020/06/09/connecting-python-with-sap-step-by-step-guide/
6 https://pypi.org/project/yfinance/ and https://pypi.org/project/sec-api/
7 https://en.wikipedia.org/wiki/Matplotlib 
8 https://help.alteryx.com/20213/designer/python-tool 
9 Bryan Walsh, “Programs that Write Programs,” Axios (July 2, 2021). www.axios.com/github-openai-copilot-automated-programming-1d493967-9497-49ee-8881-8a9172820252.html
10 T. Britton, L. Jeng, G. Carver, P. Cheak, and T. Katzenellenbogen, “Reversible Debugging Software - Quantify the time and cost saved using reversible debuggers,” University of Cambridge Judge Business School. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.370.9611&rep=rep1&type=pdf


J. L. “John” Alarcon, CPA, CGMA, CITP, is a principal at BEARN LLC in Philadelphia and a member of the Pennsylvania CPA Journal Editorial Board. He can be reached at john.alarcon@bearnllc.com.

Kevin C. Moffitt, PhD, is an associate professor in the accounting and information systems department at the Rutgers Business School in Newark, N.J. He can be reached at kevin.moffitt@business.rutgers.edu. 

Cory Ng, CPA, DBA, CGMA, is an associate professor of instruction in accounting at the Fox School of Business at Temple University in Philadelphia and is the chair of the Pennsylvania CPA Journal Editorial Board. He can be reached at cory.ng@temple.edu.
