Pulse Plus

PhonePe recently released the Pulse repo, built from their payment data. It was hard to get an overview of the data without some transformation: the data is nested eight levels deep, and data serving a similar purpose is spread across multiple files. With 2000+ files, it is hard to run command-line aggregate queries for exploration, let alone do any analysis. So I created an SQLite database of the data using the Python sqlite-utils library. The database holds the aggregated data and top data in 5 tables: aggregated_user, aggregated_user_device, aggregated_transaction, top_user, top_transaction.
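To illustrate the idea of flattening nested JSON into a queryable table, here is a minimal sketch. It is not the actual conversion script: it uses the standard-library sqlite3 module in place of sqlite-utils so it runs without extra installs, and the sample payload, its field names, and the column names are assumptions made up for illustration. Only the table name aggregated_transaction comes from the list above.

```python
import sqlite3

# Hypothetical payload standing in for one nested JSON file.
# The field names here are assumptions, not a documented schema.
sample = {
    "data": {
        "transactionData": [
            {
                "name": "Recharge & bill payments",
                "paymentInstruments": [
                    {"type": "TOTAL", "count": 100, "amount": 2500.0}
                ],
            },
            {
                "name": "Peer-to-peer payments",
                "paymentInstruments": [
                    {"type": "TOTAL", "count": 40, "amount": 9000.0}
                ],
            },
        ]
    }
}


def flatten_transactions(payload, region, year, quarter):
    """Yield one flat row per (category, payment instrument) pair."""
    for entry in payload["data"]["transactionData"]:
        for inst in entry["paymentInstruments"]:
            yield (
                region,
                year,
                quarter,
                entry["name"],
                inst["type"],
                inst["count"],
                inst["amount"],
            )


def load_rows(conn, rows):
    """Create the table if needed and insert the flattened rows."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS aggregated_transaction (
            region TEXT,
            year INTEGER,
            quarter INTEGER,
            name TEXT,
            instrument_type TEXT,
            txn_count INTEGER,
            amount REAL
        )
        """
    )
    conn.executemany(
        "INSERT INTO aggregated_transaction VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()


# In-memory database for the example; a file path would persist it.
conn = sqlite3.connect(":memory:")
load_rows(conn, flatten_transactions(sample, "india", 2021, 1))

# Once the data is flat, an aggregate query is a single SQL statement.
totals = conn.execute(
    """
    SELECT name, SUM(txn_count), SUM(amount)
    FROM aggregated_transaction
    GROUP BY name
    ORDER BY name
    """
).fetchall()
print(totals)
```

In a real run, the region, year, and quarter values would presumably come from each file's path, and the loop would walk all 2000+ files instead of a single in-memory sample.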