Fill empty values subcommand #117

Open
wants to merge 1 commit into
base: master


alexrudy

This adds a fill subcommand. I'm not strongly opinionated about this being in xsv, but I find it useful, so I thought I would make the PR and see.

Motivation: I have CSVs (usually from DB dumps) with empty values in certain columns, which I want to fill forwards from the last valid value or backwards from the next valid value.

Implementation: Iterate through the rows, store the last valid value seen, and fill in missing values as they are found. The --backfill argument fills empty values that occur before the first valid value, so it must buffer those rows until a valid value appears.
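
To make the fill logic above concrete, here is a minimal in-memory sketch. It is not the code in this PR: the function name `fill_column`, the `Vec<Vec<String>>` row representation, and the `backfill` parameter are all hypothetical, and it assumes every row has at least `col + 1` fields.

```rust
/// Hypothetical sketch: fill empty cells in column `col` with the last
/// non-empty value seen. With `backfill`, rows that precede the first
/// non-empty value are buffered and filled with that first value.
/// Assumes every row has at least `col + 1` fields.
fn fill_column(rows: &[Vec<String>], col: usize, backfill: bool) -> Vec<Vec<String>> {
    let mut out = Vec::with_capacity(rows.len());
    let mut last: Option<String> = None;
    // Rows held back until the first valid value appears (backfill only).
    let mut pending: Vec<Vec<String>> = Vec::new();

    for row in rows {
        let mut row = row.clone();
        if !row[col].is_empty() {
            let value = row[col].clone();
            // Flush any buffered leading rows, filled with this value.
            for mut held in pending.drain(..) {
                held[col] = value.clone();
                out.push(held);
            }
            last = Some(value);
            out.push(row);
        } else {
            match &last {
                // Forward fill from the last valid value.
                Some(v) => {
                    row[col] = v.clone();
                    out.push(row);
                }
                // No valid value yet: buffer the row if backfilling.
                None if backfill => pending.push(row),
                // Otherwise pass the row through unchanged.
                None => out.push(row),
            }
        }
    }
    // If no valid value ever appeared, emit the buffered rows unchanged.
    out.extend(pending);
    out
}
```

Without `backfill` this needs only one stored value per column. The buffer is the extra cost of backfilling, and it is only as large as the run of leading empty rows.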

There is also a --groupby feature. Grouping is not something xsv does very often, but I find it helpful here: in application logs, for example, you can group by user_id if the filled value should only apply to each user, or by app_version if the value really applies to each app version. This does increase the memory overhead in proportion to the number of groups.
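
The grouped variant could be sketched roughly as follows. Again, this is an illustration rather than the PR's implementation: `fill_column_grouped`, `key_col`, and `fill_col` are hypothetical names, only forward filling is shown, and every row is assumed to contain both columns.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: forward-fill column `fill_col`, tracking the last
/// valid value separately for each distinct value of column `key_col`.
/// Assumes every row contains both columns.
fn fill_column_grouped(
    rows: &[Vec<String>],
    key_col: usize,
    fill_col: usize,
) -> Vec<Vec<String>> {
    // One stored value per group: this map is the extra memory overhead.
    let mut last_seen: HashMap<String, String> = HashMap::new();
    let mut out = Vec::with_capacity(rows.len());

    for row in rows {
        let mut row = row.clone();
        if row[fill_col].is_empty() {
            // Fill only from a value previously seen in the same group.
            let carried = last_seen.get(&row[key_col]).cloned();
            if let Some(v) = carried {
                row[fill_col] = v;
            }
        } else {
            last_seen.insert(row[key_col].clone(), row[fill_col].clone());
        }
        out.push(row);
    }
    out
}
```

The map holds one entry per group, which is where the memory cost proportional to the number of groups comes from.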
