Facebook has open-sourced Pysa, an internal tool used on Instagram to detect and fix bugs in the huge Python codebase of the app. Pysa can automatically identify vulnerable code snippets written by Facebook engineers before they are integrated into the social network’s systems.
Pysa is a static analyzer, meaning it scans code in its “static” form, before the code is run. It looks for patterns commonly seen in bugs and flags potential issues in the code.
Facebook developed Pysa internally and claims that the tool has now reached maturity through continuous improvement. It says that in the first half of 2020, Pysa was able to detect 44% of all security bugs in Instagram’s server-side Python code.
Pysa is built exclusively to analyze code written in the Python programming language. That limits where the tool can be used, but given how much Python’s popularity has grown in the last few years, it is still quite useful.
How does Pysa detect security issues in a Python codebase?
Pysa detects security issues by tracking the flow of data through an application and checking whether it ends up somewhere it is not supposed to.
For instance, developers can use the tool to check whether the input a user enters into a public website form is sent directly to the backend database without being checked first. This helps identify loopholes that attackers can exploit to inject malicious code into the application’s database.
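To make the form-to-database scenario concrete, here is a minimal sketch of the kind of flow such a check is meant to catch. It is not taken from Instagram's code or from Pysa itself: the table, the function names (make_db, find_user_unsafe, find_user_safe) and the payload are all hypothetical, and Python's built-in sqlite3 module stands in for the backend database.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2")],
    )
    return conn


def find_user_unsafe(conn, name):
    # User-controlled text is concatenated straight into the SQL string,
    # so the input can change the meaning of the query itself.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # A parameterized query keeps the input as data, never as SQL.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "x' OR '1'='1"  # hypothetical malicious form input
    print("unsafe:", find_user_unsafe(conn, payload))
    print("safe:  ", find_user_safe(conn, payload))
```

With the payload above, the unsafe version returns every row in the table, while the parameterized version returns nothing, because no user is literally named that way.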
While that sounds pretty simple, in reality it is not, because data doesn’t always take a direct route inside an application. Any input entered into a website form might have to pass through multiple components before it reaches the vulnerable backend database. In such cases, finding security weak points can be quite difficult. This usually happens in the complex codebases of bigger platforms with a large number of components.
To solve this problem, Pysa analyzes code layer by layer. It performs “iterative rounds of analysis to build summaries to determine which functions return data from a source and which functions have parameters that eventually reach a sink.”
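The idea of repeated rounds that build per-function summaries can be sketched in a few lines of Python. This is only a toy model written for illustration, not Pysa's actual algorithm or data structures: the call graph, the function and parameter names, and the helpers propagate and is_flagged are all assumptions.

```python
# Toy call graph (hypothetical names, not real Instagram or Pysa code).
# function -> functions whose return value it passes back to its caller
RETURNS_RESULT_OF = {
    "read_form_field": [],
    "get_comment": ["read_form_field"],
    "get_trimmed_comment": ["get_comment"],
    "get_server_time": [],
}

# (function, parameter) -> (callee, callee parameter) pairs it is forwarded to
FORWARDS_PARAM_TO = {
    ("execute_sql", "query"): [],
    ("save_row", "value"): [("execute_sql", "query")],
    ("store_comment", "text"): [("save_row", "value")],
    ("log_length", "text"): [],
}

SEED_SOURCES = {"read_form_field"}        # returns user-controlled data
SEED_SINKS = {("execute_sql", "query")}   # dangerous place for such data


def propagate(seeds, edges):
    """Grow the seed set in rounds until a full round adds nothing new."""
    summary = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, targets in edges.items():
            if node not in summary and any(t in summary for t in targets):
                summary.add(node)
                changed = True
    return summary


# Summaries: which functions return source data, which parameters reach a sink.
returns_source = propagate(SEED_SOURCES, RETURNS_RESULT_OF)
reaches_sink = propagate(SEED_SINKS, FORWARDS_PARAM_TO)


def is_flagged(callee, param, arg_function):
    """Flag a call where the argument comes from a source-returning function
    and the receiving parameter eventually reaches a sink."""
    return arg_function in returns_source and (callee, param) in reaches_sink


if __name__ == "__main__":
    print(is_flagged("store_comment", "text", "get_trimmed_comment"))
    print(is_flagged("log_length", "text", "get_trimmed_comment"))
```

In this sketch, a call such as store_comment(get_trimmed_comment()) is flagged even though neither function touches the form or the database directly, because the summaries record what happens several layers down.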
This open-source tool for Python has been fine-tuned over months of internal testing to find vulnerable code associated with common security issues such as cross-site scripting, remote code execution, and SQL injection.
You can learn more about Pysa here.
Pysa is fast and works well on large codebases
The tool has been built for speed and can go over millions of lines of code in anywhere from 30 minutes to hours. This is why Pysa can identify bugs in near real time. It also lets Facebook’s developer teams use the tool confidently in their regular workflows without worrying that it will delay deadlines for shipping code.