AWS Lambda (serverless): read a CSV file as a stream

Greetings! On one of my projects I had a task to read large CSV files in Lambda. As you may know, CSV is best read from a stream, and I was wondering whether there's a way to do that in Lambda.

Lambda has tight limits on local storage (the /tmp directory gives you 512 MB by default) and on invocation payload size (about 6 MB), so the practical way to process a large CSV is to read it from a URL as a stream (for example, upload it to S3 and use a presigned URL).

I've created a ready-to-use function you can copy (TypeScript):

import axios from "axios";
import { parse } from "csv";

export const urlToCsv = async <T = Record<string, string>>(
    fileUrl: string,
    callback: (line: T) => Promise<void>,
): Promise<void> => {
    // Fetch the file as a stream so the whole CSV never sits in memory.
    const { data } = await axios.get(fileUrl, { responseType: "stream" });

    const rowPromises: Promise<void>[] = [];

    return new Promise<void>((resolve, reject) => {
        data.pipe(
            parse({
                trim: true,
                columns: true, // use the first row as object keys
                delimiter: ",",
                skip_empty_lines: true,
            })
        ).on("data", (line: T) => {
            console.log("Read csv line", line);
            rowPromises.push(callback(line));
        }).on("end", () => {
            // Wait for every row callback before resolving.
            Promise.all(rowPromises).then(() => resolve(), reject);
        }).on("error", reject);
    });
};

You'll need to run yarn add csv and yarn add axios to make it work, as it depends on these two libraries.

Usage sample:

await urlToCsv<{ column1: string; column2: string }>(fileUrl, async (line) => {
    // Do whatever you want with the line. It's also type-hinted from the generic.
});
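One caveat: the function queues a promise per row and awaits them all at the end, so a huge CSV can hold millions of pending promises in memory at once. Here's a small helper (my own sketch, not from the snippet above) that awaits rows in fixed-size batches instead:

```typescript
// Process rows with bounded concurrency instead of collecting every
// promise in memory: rows are buffered in chunks of `batchSize` and
// each full chunk is awaited before more rows are accepted.
export const inBatches = <T>(
    batchSize: number,
    handler: (line: T) => Promise<void>,
) => {
    let batch: Promise<void>[] = [];
    return {
        // Call for every parsed row; blocks while a full batch settles.
        push: async (line: T): Promise<void> => {
            batch.push(handler(line));
            if (batch.length >= batchSize) {
                const pending = batch;
                batch = [];
                await Promise.all(pending);
            }
        },
        // Call once after the stream ends to drain the final partial batch.
        flush: async (): Promise<void> => {
            await Promise.all(batch);
            batch = [];
        },
    };
};
```

To wire it in, replace rowPromises.push(callback(line)) with await batcher.push(line) (pausing and resuming the parser stream around the await if you need real backpressure) and call batcher.flush() in the 'end' handler.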
About the Author: deniskoronets
