Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import multiple files to single table #75

Open
bsilverm opened this issue Apr 9, 2018 · 2 comments
Open

Import multiple files to single table #75

bsilverm opened this issue Apr 9, 2018 · 2 comments

Comments

@bsilverm
Copy link

bsilverm commented Apr 9, 2018

Is there a way to import multiple files in to the same table? My data is split in to multiple files due to its size. Running a command like this results in two tables:

textql -console -dlm=0x7c -header -sql "Select count() from myLog" myLog1.txt myLog2.txt

Running .schema in sqlite results in

CREATE TABLE [myLog]..
CREATE TABLE [myLog2]..

Ideally there would be one table containing the data from both files.

Thanks!

@runeimp
Copy link

runeimp commented May 9, 2019

Something like one of the following should work:

textql -console -dlm=0x7c -header -sql "SELECT (SELECT count() FROM myLog1) + (SELECT count() FROM myLog1) LIMIT 1" myLog1.txt myLog2.txt

or

textql -dlm=0x7c -header -output-file myLog.txt -sql 'SELECT * FROM myLog1 OUTER JOIN myLog1.ID = myLog2.ID' myLog1.txt myLog2.txt
textql -console -dlm=0x7c -sql "SELECT count() FROM myLog" myLog.txt

@abathur
Copy link

abathur commented Jul 20, 2019

If you can make some assumptions about the file and any header rows, it's probably going to be faster to do this before the logs hit textql. Not sure how often you work with files like this, but it might be meaningful since you say the data is split over multiple files due to size.

If none of the logs have headers, it could be as simple as using cat to join them into a new file, or pipe directly to textql. If your logs do have headers, you can use head and tail. I think the following example of piping them directly into textql should work in Unix or WSL/cygwin...

# if the logs have a stable schema, just stick the header in a file by itself
head -n1 myLog1.txt > myLog.header

tail -q -n +2 myLog*.txt | cat myLog.header - | textql -sql "select count(*) from stdin"
# Breakdown of above:
# 1. extract all but the first line of the log files with tail and pipe to cat
#   (-q flag removes filename separators, effectively concatenating them)
# 2. use cat to concatenate the header file with the output from tail (indicated by the "-" argument) and pipe to textql
# 3. run textql query against the "stdin" table. 
# (You can leave out "from stdin" for simple queries, i.e. `textql -sql "select *"` but I wanted to explicitly call it out here)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants