41
loading...
This website collects cookies to deliver better user experience
Scaling, in general, is starting out with a small solution and growing it into a bigger solution.
What are your goals? You must decide beforehand, what is good enough? Over-engineering is a trap, all engineers are familiar with. Don't over-engineer!
Measure your code? To treat a patient, a doctor has to determine what the illness is? Likewise, you need to measure your code for bottlenecks before deciding on a scaling strategy.
There are many tools that can use to profile your python code. Scalene is one of the most complete measuring tools available at the moment.
Increase code efficiency: Python is designed for ease of use and easy extension, but not performance. As a developer, the onus is on you to do more work so that the application executes less code. Whenever possible use vectorized library functions instead of loops.
Python is successful in data science because of the pre-compiled code offered by data-appropriate libraries in the pydata stack such as pandas and numpy.
Adding parallelism: You can achieve parallelism on a single machine or by using multiple machines. In principle, you want to stay on a single machine as much as possible as distributed computing increased complexity because of the following challenges: