Greetings! On a recent project I had to read large CSV files in an AWS Lambda function. As you may know, CSV is best read from a stream, and I was wondering whether there's a good way to do that on Lambda.
Lambda has tight storage limits (the ephemeral /tmp volume is 512 MB by default), so the easiest way to process a large CSV is not to write it to disk at all, but to stream it from a URL (you can upload the file to S3 and read it from there).
I've created a ready-to-use function you can copy (TypeScript):
import axios from "axios";
import { parse } from "csv";

export const urlToCsv = async <T = Record<string, string>>(
  fileUrl: string,
  callback: (line: T) => Promise<void>
): Promise<void> => {
  // Fetch the file as a stream instead of buffering it in memory.
  const { data } = await axios.get(fileUrl, { responseType: "stream" });
  const rowPromises: Promise<void>[] = [];

  await new Promise<void>((resolve, reject) => {
    data
      .pipe(
        parse({
          trim: true,
          columns: true,
          delimiter: ",",
          skip_empty_lines: true,
        })
      )
      .on("data", (line: T) => {
        console.log("Read csv line", line);
        // Start the callback for each row; await them all at the end.
        rowPromises.push(callback(line));
      })
      .on("error", reject)
      .on("end", resolve);
  });

  await Promise.all(rowPromises);
};
You'll need yarn add csv
and yarn add axios
to make it work, as it depends on these 2 libraries.
Usage sample:
await urlToCsv<{column1: string, column2: string}>(fileUrl, async (line) => {
  // do whatever you want with the line; it's also type-hinted from the generic
});
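The same collect-then-await pattern works for any line stream. Here's a minimal dependency-free sketch (the `streamLines` helper is my own illustration, not part of the csv library, and it does naive line splitting with no CSV quoting support) that shows why the row promises are gathered before finishing:

```typescript
import { Readable } from "stream";
import { createInterface } from "readline";

// Minimal sketch of the same pattern with no external dependencies:
// stream lines, start an async callback per row, then await them all
// so that no row's work is dropped when the stream ends.
export const streamLines = async (
  input: Readable,
  callback: (line: string) => Promise<void>
): Promise<void> => {
  const rowPromises: Promise<void>[] = [];
  const rl = createInterface({ input });
  for await (const line of rl) {
    rowPromises.push(callback(line));
  }
  await Promise.all(rowPromises);
};

// Usage: feed an in-memory CSV body through the same flow.
void (async () => {
  const rows: string[] = [];
  await streamLines(Readable.from("a,1\nb,2\n"), async (line) => {
    rows.push(line);
  });
  // rows is now ["a,1", "b,2"]
})();
```

If you dropped the `Promise.all` step, the function could resolve while some callbacks were still in flight, which is exactly the bug the `rowPromises` array in `urlToCsv` guards against.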