2022-02-26

Reduce size of snowflake-connector-python[pandas] module

I am trying to create an AWS Lambda function that connects to a Snowflake database. For this I need the snowflake-connector-python[pandas] package (https://docs.snowflake.com/en/user-guide/python-connector-pandas.html), which together with all of its dependencies comes to around 280 MB uncompressed. This is an issue because AWS Lambda allows a maximum of 250 MB of dependencies (using AWS layers).

The size of the package is quite surprising. Looking at the dependencies, the biggest offenders are pyarrow (around 80 MB), pandas (around 60 MB), and numpy (around 40 MB). Is there a way to reduce the size of the whole package, installing only the relevant parts, so that it comes in below 250 MB? I only need to connect to Snowflake and read and write data, nothing fancy.
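For reference, the per-package figures above can be reproduced with a small script like the one below. It is only a rough sketch: it assumes the dependencies were installed into a local directory (for example with pip's --target option), and the directory name "python" and the helper names package_sizes, total_size, and report are made up for illustration. The 250 MB constant is simply the limit mentioned in the question.

```python
from pathlib import Path

# Limit quoted in the question (250 MB of unpacked dependencies).
LAMBDA_LIMIT_BYTES = 250 * 1024 * 1024


def dir_size(path):
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    return sum(
        f.stat().st_size
        for f in path.rglob("*")
        if f.is_file() and not f.is_symlink()
    )


def package_sizes(target):
    """Return (name, bytes) for each top-level entry in target, largest first."""
    sizes = []
    for entry in target.iterdir():
        size = dir_size(entry) if entry.is_dir() else entry.stat().st_size
        sizes.append((entry.name, size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)


def total_size(sizes):
    """Total bytes across all entries returned by package_sizes."""
    return sum(size for _, size in sizes)


def report(target, top=10):
    """Print the largest entries and how the total compares to the limit."""
    sizes = package_sizes(target)
    for name, size in sizes[:top]:
        print(f"{size / 1024 / 1024:8.1f} MB  {name}")
    total = total_size(sizes)
    verdict = "over" if total > LAMBDA_LIMIT_BYTES else "within"
    print(f"{total / 1024 / 1024:8.1f} MB  total ({verdict} the 250 MB limit)")


# Example call, assuming the packages were installed into ./python:
# report(Path("python"))
```

Running report on the install directory lists the largest top-level packages first, which makes it easy to see how much would have to be trimmed to get under the limit.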

I know there are other options in these cases, such as containers, but I would like to avoid them if possible.


