Importing json file #430

Open
SimVid opened this issue Mar 19, 2021 · 3 comments

@SimVid

SimVid commented Mar 19, 2021

Hi folks - I'm having an issue importing a JSON file. I've set the bin name and the name of the JSON file in the import-jsondump.php script:

// specify the name of the bin here
$bin_name = 'leadershippaper';
// specify dir with the user timelines (json)
$dir = 'sample';
// set type of dump ('import follow' or 'import track')
$type = 'import track';
// if 'import track', specify keywords for which data was captured
$queries = array();

The sample.json file is in the 'import' directory, and when I run the script this is what I get:

root@server:/var/www/dmi-tcat/import# php import-jsondump.php
[debug] querybin_id = 15

Number of tweets: 0
Unique tweets: 0
Unique users: 0
Processed 0 tweets!

Total number of timelines: 0
Valid timelines: 0
Invalid timelines: 0
Populated timelines: 0
Empty timelines: 0

Help will be much appreciated.

@ErikBorra
Member

How did you retrieve that json and what format does it have? Could you provide a snippet of the json (starting at the top)?

@SimVid
Author

SimVid commented Mar 19, 2021

Thanks, Erik!
I didn't retrieve it myself, but the description that came with the data says that Twarc was used to hydrate the retrieved tweet IDs.

Below is a snippet from the json file:

[{"created_at": "Fri Jan 24 16:24:23 +0000 2014", "id": 426752350032650240, "id_str": "426752350032650240", "full_text": "Thanks again @jcostik! The power of #opensource @askmanny @mrmikelawson #wearenotwaiting #t1d", "truncated": false, "display_text_range": [0, 93], "entities": {"hashtags": [{"text": "opensource", "indices": [36, 47]}, {"text": "wearenotwaiting", "indices": [72, 88]}, {"text": "t1d", "indices": [89, 93]}], "symbols": [], "user_mentions": [{"screen_name": "jcostik", "name": "John Costik", "id": 71270503, "id_str": "71270503", "indices": [13, 21]}, {"screen_name": "askmanny", "name": "Manny Hernandez", "id": 5986922, "id_str": "5986922", "indices": [48, 57]}, {"screen_name": "MrMikeLawson", "name": "Mike Lawson", "id": 15053068, "id_str": "15053068", "indices": [58, 71]}], "urls": []}, "source": "<a href=\"https://tapbots.com/software/tweetbot/mac\" rel=\"nofollow\">Tweetbot for Mac</a>", "in_reply_to_status_id": 426741704226766848, "in_reply_to_status_id_str":
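
The snippet above starts with `[{`, so the file appears to be one JSON array rather than one tweet per line. Below is a minimal Python sketch for checking which layout a file has and, if needed, rewriting an array as line-delimited JSON. Whether import-jsondump.php expects one tweet per line is an assumption here, not something confirmed in this thread, and the function names `detect_layout` and `array_to_lines` are hypothetical helpers, not part of TCAT or twarc.

```python
import json
from pathlib import Path


def detect_layout(path):
    """Return 'array' if the file is one JSON array, 'lines' if every
    non-empty line is its own JSON object, otherwise 'unknown'."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        if isinstance(json.loads(text), list):
            return "array"
    except json.JSONDecodeError:
        pass  # not a single JSON document; try line by line
    lines = [ln for ln in text.splitlines() if ln.strip()]
    try:
        if lines and all(isinstance(json.loads(ln), dict) for ln in lines):
            return "lines"
    except json.JSONDecodeError:
        pass
    return "unknown"


def array_to_lines(src, dst):
    """Rewrite a JSON array of tweets as one JSON object per line.
    Returns the number of tweets written."""
    tweets = json.loads(Path(src).read_text(encoding="utf-8"))
    with open(dst, "w", encoding="utf-8") as out:
        for tweet in tweets:
            out.write(json.dumps(tweet, ensure_ascii=False) + "\n")
    return len(tweets)
```

For example, `detect_layout("sample.json")` followed by `array_to_lines("sample.json", "sample.jsonl")` would produce a line-delimited copy while leaving the original file untouched (both file names are placeholders). Note that this loads the whole file into memory, which may matter for a dump covering several years.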

@SimVid
Author

SimVid commented Mar 22, 2021

Hi again,

I keep trying, and this is where I've got to: when I run the script on the file covering all 4 years, it generates the following errors:

root@server:/var/www/dmi-tcat/import# php import-jsondump.php
processing ../import/attempt1.json
.....................................................................................................................................2
processing ../import/trying.json
PHP Warning: Invalid argument supplied for foreach() in /var/www/dmi-tcat/capture/common/functions.php on line 1771
PHP Warning: array_key_exists() expects parameter 2 to be array, null given in /var/www/dmi-tcat/capture/common/functions.php on line 1782
PHP Warning: array_key_exists() expects parameter 2 to be array, null given in /var/www/dmi-tcat/capture/common/functions.php on line 1785
PHP Warning: array_key_exists() expects parameter 2 to be array, null given in /var/www/dmi-tcat/capture/common/functions.php on line 1786
PHP Fatal error: Uncaught PDOException: SQLSTATE[42000]: Syntax error or access violation: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ')' at line 1 in /var/www/dmi-tcat/capture/common/functions.php:1920
Stack trace:
#0 /var/www/dmi-tcat/capture/common/functions.php(1920): PDOStatement->execute()
#1 /var/www/dmi-tcat/import/import-jsondump.php(69): Tweet->isInBin('attempt1')
#2 /var/www/dmi-tcat/import/import-jsondump.php(45): process_json_file_timeline('../import/tryin...', Object(PDO))
#3 {main}
thrown in /var/www/dmi-tcat/capture/common/functions.php on line 1920

I then tried to import a smaller file covering 3 months. It ran without errors, and this is what I got:

root@server:/var/www/dmi-tcat/import# php import-jsondump.php
The query bin 'attempt1' already exists. Are you sure you want to add tweets to 'attempt1'? (yes/no)
yes
processing ../import/attempt1.json
.....................................................................................................................................1
[debug] querybin_id = 24
[debug] UPDATE tcat_query_bins_periods SET starttime = :starttime, endtime = :endtime WHERE querybin_id = :querybin_id (24, 2014-01-16 19:29:38, 2018-11-07 22:12:58)

Number of tweets: 0
Unique tweets: 0
Unique users: 0
Processed 0 tweets!

Total number of timelines: 1
Valid timelines: 0
Invalid timelines: 0
Populated timelines: 0
Empty timelines: 0

Any help will be appreciated...
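
The PHP warnings above ("Invalid argument supplied for foreach()", "array_key_exists() expects parameter 2 to be array, null given") are consistent with a record that did not decode into an array with the fields the importer reads. A rough Python sketch for scanning a line-delimited file for such records before importing follows. The key list is an assumption, since the fields actually read in capture/common/functions.php are not shown in this thread, and `check_records` is a hypothetical helper.

```python
import json

# Assumed keys; the fields TCAT actually reads are not shown in this thread.
REQUIRED_KEYS = ("id_str", "created_at", "user", "entities")


def check_records(lines):
    """Yield (line_number, problem) for each line that does not look
    like a usable tweet object. Blank lines are skipped."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            yield number, f"not valid JSON: {err.msg}"
            continue
        if not isinstance(record, dict):
            yield number, f"expected an object, got {type(record).__name__}"
            continue
        missing = [key for key in REQUIRED_KEYS if key not in record]
        # Hydrated tweets in extended mode carry 'full_text' instead of 'text'.
        if "text" not in record and "full_text" not in record:
            missing.append("text/full_text")
        if missing:
            yield number, "missing " + ", ".join(missing)
```

Passing an open file handle, for instance `check_records(open("trying.jsonl", encoding="utf-8"))`, and printing each yielded pair would list the line numbers worth inspecting by hand (the `.jsonl` name is a placeholder for a line-delimited copy of the dump).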
