thread-keeper-automator

Automated Twitter URL processing for the excellent thread-keeper from the Harvard Library Innovation Laboratory

  1. Deploy the thread-keeper to your server
  2. Download an archive of your Twitter account
  3. Change the extension of the /twitter-xxxx-xxxx/data/tweets.js file from .js to .json and validate it (see the sketch after the JSON excerpt below)
  4. Extract the .tweet.id and .tweet.created_at from the file:
  {
    "tweet" : {
      "edit_info" : {
        "initial" : {
          "editTweetIds" : [
            "1593006617125326848"
          ],
          "editableUntil" : "2022-11-16T22:52:28.811Z",
          "editsRemaining" : "5",
          "isEditEligible" : false
        }
      },
      "retweeted" : false,
      "source" : "<a href=\"https://mobile.twitter.com\" rel=\"nofollow\">Twitter Web App</a>",
      "entities" : {
        "hashtags" : [ ],
        "symbols" : [ ],
        "user_mentions" : [
          {
            "name" : "CRKN RCDR",
            "screen_name" : "CRKN_RCDR",
            "indices" : [
              "3",
              "13"
            ],
            "id_str" : "813991658",
            "id" : "813991658"
          },
          {
            "name" : "McGill Library",
            "screen_name" : "McGillLib",
            "indices" : [
              "36",
              "46"
            ],
            "id_str" : "21223663",
            "id" : "21223663"
          }
        ],
        "urls" : [ ]
      },
      "display_text_range" : [
        "0",
        "140"
      ],
      "favorite_count" : "0",
      "id_str" : "1593006617125326848",
      "truncated" : false,
      "retweet_count" : "0",
      "id" : "1593006617125326848",
      "created_at" : "Wed Nov 16 22:22:28 +0000 2022",
      "favorited" : false,
      "full_text" : "RT @CRKN_RCDR: 🗺In partnership with @McGillLib, we have added approximately 22,000 digitized Canadian maps to the Canadiana Collection. It’…",
      "lang" : "en"
    }
  },
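
In the Twitter archive, data/tweets.js usually begins with a JavaScript assignment (something like window.YTD.tweets.part0 = ) rather than bare JSON, so renaming the file alone may not be enough. A minimal sketch for step 3, treating that prefix as an assumption about your particular archive:

sed 's/^window.YTD.tweets.part0 = //' tweets.js > tweets.json

jq empty tweets.json

The jq empty call is just a validity check: it prints nothing if the file parses as JSON and an error otherwise.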

Extraction example using jq:

jq '.[].tweet.id' tweets.json > tweetsID.json

jq '.[].tweet.created_at' tweets.json > tweetsdates.json

and combine the two outputs into a CSV . . .
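
If you prefer to skip the intermediate files, jq's @csv filter can write both fields to a CSV in one pass (the output filename here is just an example):

jq -r '.[].tweet | [.id, .created_at] | @csv' tweets.json > tweets.csv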

  5. Clean up the .csv (see example.csv with added columns for date sorting). I split the CSV into multiple sheets and pulled tweets by year (a command-line alternative is sketched below this list).

  6. Put the .csv in the same directory as the Python script.

  7. Execute the script:

python ffHead.py (GUI) or python ffHeadless.py (headless)
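
For the by-year split mentioned in step 5, a plain grep on the year inside created_at is a low-tech alternative to spreadsheet sheets (assuming a CSV like the one produced above; repeat per year):

grep '+0000 2022' tweets.csv > tweets-2022.csv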

  8. Combine the PDFs. I used Ghostscript here:

gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=COMBINEDPDFS.pdf *.pdf

(Combining files will likely break the signatures applied to each PDF) . . . this is something to be explored further.

  9. Sign the new combined document (if you like). I have been using open-pdf-sign.
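
A sketch of what an open-pdf-sign run might look like; the jar name, flags, and certificate files below are assumptions, so check the open-pdf-sign documentation for the exact options:

java -jar open-pdf-sign.jar --input COMBINEDPDFS.pdf --output COMBINEDPDFS-signed.pdf --certificate certificate.crt --key key.pem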
