Commit

#19 work
Dcosthephalump committed Jul 20, 2023
1 parent 72bd417 commit de36142
Showing 1 changed file with 164 additions and 1 deletion.
165 changes: 164 additions & 1 deletion manuscriptFiles.ipynb
@@ -34,7 +34,7 @@
},
{
"cell_type": "code",
-"execution_count": 3,
+"execution_count": 2,
"metadata": {
"tags": []
},
@@ -456,6 +456,169 @@
"uploader.value"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## saveTranscript\n",
"\n",
"Like ```saveImage```, the ```saveTranscript``` function takes two arguments:\n",
"- ```files```: the uploaded file data from a ```FileUpload``` widget's ```value```\n",
"- ```targetDirectory```: the path of the directory in which to save the files"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#| export\n",
"def saveTranscript(files:dict, targetDirectory):\n",
"    pass  # placeholder so the cell runs; the body is not written yet"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "67cb3e95f7fb46bf98037416306f03e9",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"FileUpload(value=(), description='Upload Manuscript Transcripts', layout=Layout(height='auto', width='auto'), …"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#| hide\n",
"import ipywidgets as widgets\n",
"\n",
"uploader = widgets.FileUpload(\n",
" accept = '', # Accepted file extension e.g. '.txt', '.pdf', 'image/*', 'image/*,.pdf'\n",
" multiple = True, # True to accept multiple files upload else False\n",
" description = 'Upload Manuscript Transcripts',\n",
" layout = widgets.Layout(height='auto', width='auto')\n",
")\n",
"uploader"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"({'name': '15_01_0053_0006_f_3r_res.xml',\n",
" 'type': 'text/xml',\n",
" 'size': 18246,\n",
" 'content': <memory at 0x7fee80ad87c0>,\n",
" 'last_modified': datetime.datetime(2021, 7, 9, 16, 4, 20, tzinfo=datetime.timezone.utc)},\n",
" {'name': '15_01_0053_0007_f_3v_res.xml',\n",
" 'type': 'text/xml',\n",
" 'size': 27037,\n",
" 'content': <memory at 0x7fee80ad8880>,\n",
" 'last_modified': datetime.datetime(2021, 7, 9, 16, 4, 20, tzinfo=datetime.timezone.utc)},\n",
" {'name': '15_01_0053_0008_f_4r_res.xml',\n",
" 'type': 'text/xml',\n",
" 'size': 26795,\n",
" 'content': <memory at 0x7fee80ad8940>,\n",
" 'last_modified': datetime.datetime(2021, 7, 9, 16, 4, 20, tzinfo=datetime.timezone.utc)})"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"uploader.value"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<Element '{http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15}PcGts' at 0x7fee806d7240>\n"
]
}
],
"source": [
"#| hide\n",
"import io\n",
"import xml.etree.ElementTree as ET\n",
"input_stream = ET.parse(io.BytesIO(uploader.value[0]['content']))\n",
"root = input_stream.getroot()\n",
"\n",
"print(root)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'xml.etree.ElementTree.Element'>\n",
"\n",
" \n"
]
}
],
"source": [
"print(type(root))\n",
"\n",
"print(root[1][0].text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"root.keys()  # call the method; a bare root.keys only echoes the method object"
]
},
{
"cell_type": "markdown",
"metadata": {},
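The `saveTranscript` cell in this commit is still an empty stub. Based on the markdown description and the `uploader.value` output shown above, one possible body is sketched below. It is not the author's implementation. It assumes `files` is the tuple of dicts that `FileUpload.value` returns in the output above (each with `name` and `content` keys), and it assumes every transcript is XML, so it parses each upload once before writing it. The name `savedPaths` and the returned list are illustrative additions.

```python
import io
import os
import xml.etree.ElementTree as ET


def saveTranscript(files, targetDirectory):
    """Write each uploaded transcript into targetDirectory and return the saved paths.

    Assumption: `files` is an iterable of dicts shaped like the entries of
    `uploader.value` in the diff, each with 'name' and 'content' keys.
    """
    os.makedirs(targetDirectory, exist_ok=True)
    savedPaths = []
    for entry in files:
        # Keep only the base name so an upload cannot escape targetDirectory.
        fileName = os.path.basename(entry["name"])
        # 'content' is a memoryview in the output above; bytes() also accepts bytes.
        data = bytes(entry["content"])
        # Assumption: transcripts are XML, so malformed input raises ParseError here.
        ET.parse(io.BytesIO(data))
        path = os.path.join(targetDirectory, fileName)
        with open(path, "wb") as handle:
            handle.write(data)
        savedPaths.append(path)
    return savedPaths
```

With the widget from the diff, a call could look like `saveTranscript(uploader.value, "transcripts")`, where `"transcripts"` is a hypothetical directory name. Parsing before writing means a malformed file raises `xml.etree.ElementTree.ParseError` and is not written, although files earlier in the same batch will already be on disk.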
