coecms · HoWol76 · Oct 2, 2018 · Dec 12, 2018 · Dec 12, 2018 · Dec 12, 2018
diff --git a/_posts/2018-12-14_Fortran_Files.md b/_posts/2018-12-14_Fortran_Files.md
@@ -0,0 +1,9 @@
+---
+title: Fortran Binary Files
+layout: notebook
+notebook: 2018-12-14_Fortran_Files.html
+author: Holger Wolff
+excerpt: >-
+    A quick introduction to how Fortran stores binary files and how to read
+    them using Python
+---
diff --git a/_posts/2019-09-03-Python_Animation.md b/_posts/2019-09-03-Python_Animation.md
@@ -0,0 +1,8 @@
+---
+title: Animating fields with Python
+layout: notebook
+notebook: 2019-09-03_Python_Animation.ipynb
+author: Holger Wolff
+excerpt: >-
+    Quick guide on creating animated gifs and mp4 videos of datasets
+---
diff --git a/notebooks/2018-12-14_Fortran_Files.ipynb b/notebooks/2018-12-14_Fortran_Files.ipynb
@@ -0,0 +1,333 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# How Fortran stores binary files"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Introduction\n",
+    "\n",
+    "Fortran is still the go-to language for number crunching."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Types of Fortran Files\n",
+    "\n",
+    "There are three different native ways for Fortran to store data in files:\n",
+    "\n",
+    "1. Formatted\n",
+    "2. Unformatted\n",
+    "3. Stream\n",
+    "\n",
+    "Then, there are libraries to store the data in specific formats, for example NetCDF.\n",
+    "\n",
+    "If you want to store complex data sets for a long time, I strongly recommend NetCDF or another dedicated data format. \n",
+    "We have previously covered on this blog how to write NetCDF files with Fortran and Python, and the format offers features like compression and documentation that are very beneficial.\n",
+    "\n",
+    "But what if the data isn't very complex and NetCDF would be overkill? \n",
+    "Or if you received the data from someone else and they didn't bother with this?\n",
+    "\n",
+    "This blog post will help you with your task of reading the data."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## A few notes on best practices\n",
+    "\n",
+    "There are some good practices on how to deal with files in Fortran. \n",
+    "These make it easier to port data, because they will work on different systems.\n",
+    "\n",
+    "### Ensure that you know the kind of the variable\n",
+    "\n",
+    "If you write something like\n",
+    "\n",
+    "```fortran\n",
+    "integer :: ii\n",
+    "```\n",
+    "\n",
+    "you don't really know what kind of integer `ii` will be. \n",
+    "Often, you can set the default integer and real kind with compiler options, but it's far better to explicitly declare the kind in the code itself.\n",
+    "\n",
+    "Since Fortran 2008 -- and all compilers we use today support this -- you can use the intrinsic `iso_fortran_env` module to get the proper kinds:\n",
+    "\n",
+    "```fortran\n",
+    "use iso_fortran_env, only: int32, real64\n",
+    "implicit none\n",
+    "integer(kind=int32) :: ii\n",
+    "real(kind=real64) :: x(10, 100)\n",
+    "```\n",
+    "\n",
+    "In old code, you might find statements like:\n",
+    "\n",
+    "```fortran\n",
+    "integer*4 ii        ! DO NOT DO THIS\n",
+    "```\n",
+    "\n",
+    "This syntax has *never* been standard, and I strongly discourage you from using it.\n",
+    "Slightly better, but still wrong, is this:\n",
+    "\n",
+    "```fortran\n",
+    "integer(kind=4) :: ii   ! Still not good\n",
+    "```\n",
+    "\n",
+    "There is no guarantee that every compiler will use the same kind values for the same variable types.\n",
+    "If for some reason you cannot use `iso_fortran_env`, use the intrinsic functions `selected_int_kind` and `selected_real_kind` instead:\n",
+    "\n",
+    "```fortran\n",
+    "integer, parameter :: real64 = selected_real_kind(15, 307)\n",
+    "real(kind=real64) :: x(10, 100)\n",
+    "```\n",
+    "\n",
+    "See the table below for which kind you need:\n",
+    "\n",
+    "| bytes | int name | integer kind | integer max | real name | real kind |\n",
+    "|-------|----------|--------------|-------------|-----------|-----------|\n",
+    "|  2  | `int16` | `selected_int_kind(3)` | 32767 | | |\n",
+    "|  4  | `int32` | `selected_int_kind(5)` | > 2*10^9 | `real32` | `selected_real_kind(6, 37)` |\n",
+    "|  8  | `int64` | `selected_int_kind(10)` | > 9*10^18 | `real64` | `selected_real_kind(15, 307)` |\n",
+    "| 16  | ----    | `selected_int_kind(19)` | > 10^38 | `real128` | `selected_real_kind(33, 4931)` |\n",
+    "\n",
+    "Note that `iso_fortran_env` does not define a named kind `int128`, though your compiler might provide one. \n",
+    "Some compilers also have a 10-byte real kind.\n",
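+    "\n",
+    "When the data is later read with NumPy, each kind needs a matching type string. A minimal sketch, assuming little-endian files and that the kinds map onto NumPy types as shown (the `kinds` dictionary is only illustrative, not part of any library):\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "\n",
+    "# Assumed correspondence between iso_fortran_env kinds and NumPy type strings\n",
+    "# ('<' = little-endian, i/f = integer/float, digit = size in bytes).\n",
+    "kinds = {'int16': '<i2', 'int32': '<i4', 'int64': '<i8',\n",
+    "         'real32': '<f4', 'real64': '<f8'}\n",
+    "\n",
+    "for name, code in kinds.items():\n",
+    "    dt = np.dtype(code)\n",
+    "    # iinfo describes integer types, finfo floating-point types\n",
+    "    info = np.iinfo(dt) if dt.kind == 'i' else np.finfo(dt)\n",
+    "    print(name, code, dt.itemsize, info.max)\n",
+    "```\n",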
+    "\n",
+    "### newunit\n",
+    "\n",
+    "Whenever you interact with a file, you need a unit, an integer value that references a specific open file.\n",
+    "Some I/O streams, specifically Standard Input, Standard Output, and Standard Error, have compiler-dependent unit values, which unfortunately are not standardised.\n",
+    "\n",
+    "Keeping track of these values while remaining compiler-agnostic gets a bit confusing.\n",
+    "Fortunately, there's an option for that: `newunit`.\n",
+    "\n",
+    "Instead of using a hardcoded integer value, declare an integer variable with a meaningful name, then open the file with the `newunit=` specifier instead of `unit=`:\n",
+    "\n",
+    "```fortran\n",
+    "integer :: output_handle\n",
+    "...\n",
+    "open(newunit=output_handle, file='data.dat', ...)\n",
+    "...\n",
+    "write(output_handle, *) values(:, i)\n",
+    "...\n",
+    "close(output_handle)\n",
+    "```\n",
+    "\n",
+    "A new, unused value is assigned every time you open the file, and you don't have to worry about interfering file handles any more."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Stream\n",
+    "\n",
+    "Stream access has been part of the Fortran standard since Fortran 2003. \n",
+    "\n",
+    "The binary representation of the data is written directly to the file, without any metadata.\n",
+    "\n",
+    "### Fortran writing stream data\n",
+    "\n",
+    "```fortran\n",
+    "program write_stream\n",
+    "    use iso_fortran_env, only: int16\n",
+    "    implicit none\n",
+    "    integer(kind=int16) :: ii\n",
+    "    integer :: output_handle\n",
+    "    open(newunit=output_handle, file='stream_data.dat', action='write',   &\n",
+    "         status='replace', access='stream', form='unformatted')\n",
+    "    write(output_handle) [(ii, ii=1, 10)]\n",
+    "    write(output_handle) \"Hello World\"\n",
+    "    close(output_handle)\n",
+    "end program write_stream\n",
+    "``` "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "00000000  01 00 02 00 03 00 04 00  05 00 06 00 07 00 08 00  |................|\r\n",
+      "00000010  09 00 0a 00 48 65 6c 6c  6f 20 57 6f 72 6c 64     |....Hello World|\r\n",
+      "0000001f\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "!hexdump -C stream_data.dat"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This has written the 16-bit values from 1 to 10, followed by the ASCII values for \"Hello World\".\n",
+    "\n",
+    "### Reading it into Python\n",
+    "\n",
+    "If it were purely one large array, it would be very easy to read it into Python:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([    1,     2,     3,     4,     5,     6,     7,     8,     9,\n",
+       "          10, 25928, 27756,  8303, 28503, 27762], dtype=int16)"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import numpy as np\n",
+    "np.fromfile('stream_data.dat', '<i2')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The `np.fromfile` function reads the data stream in as-is, and interprets the values according to the datatype you gave it, in the above case little-endian 2-byte integers.\n",
+    "\n",
+    "For an overview of possible data types, see [here](https://docs.scipy.org/doc/numpy/reference/arrays.interface.html#python-side).\n",
+    "\n",
+    "The integer values are correctly read in, but of course the 'H' and 'e' get mashed into a single integer value of 25928, 'll' becomes 27756, and so forth.\n",
+    "\n",
+    "Still, this might be the simplest way to transfer a single array bit-correct to Python."
+   ]
+  },
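+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If the layout of the stream is known in advance, a structured dtype can separate the integers from the text. The following is a minimal sketch, assuming exactly the layout written above (ten little-endian 2-byte integers followed by 11 bytes of text); `layout` and `read_stream` are illustrative names, and the file is recreated from Python so the example runs without Fortran:\n",
+    "\n",
+    "```python\n",
+    "import os\n",
+    "import tempfile\n",
+    "import numpy as np\n",
+    "\n",
+    "# Assumed layout of stream_data.dat: ten int16 values, then 11 bytes of\n",
+    "# ASCII text, with no padding in between (31 bytes in total).\n",
+    "layout = np.dtype([('values', '<i2', (10,)), ('text', 'S11')])\n",
+    "\n",
+    "def read_stream(path):\n",
+    "    # Read exactly one record of the layout and split it into its fields.\n",
+    "    record = np.fromfile(path, dtype=layout, count=1)[0]\n",
+    "    return record['values'], record['text'].decode('ascii')\n",
+    "\n",
+    "# Recreate the 31 bytes shown in the hexdump in a temporary directory.\n",
+    "path = os.path.join(tempfile.mkdtemp(), 'stream_data.dat')\n",
+    "with open(path, 'wb') as f:\n",
+    "    f.write(np.arange(1, 11, dtype='<i2').tobytes() + b'Hello World')\n",
+    "\n",
+    "values, text = read_stream(path)\n",
+    "print(values, text)\n",
+    "```\n",
+    "\n",
+    "Unlike the plain `'<i2'` read, which returned 15 values (30 bytes) and so left the final byte of the 31-byte file unread, this keeps the text intact."
+   ]
+  },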
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Unformatted sequential\n",
+    "\n",
+    "There is no standardised method to store unformatted sequential data, and the exact format might vary between different compilers and platforms.\n",
+    "\n",
+    "That said, most compilers seem to store it in a similar way by now.\n",
+    "\n",
+    "### Fortran Write\n",
+    "\n",
+    "```fortran\n",
+    "program write_unformatted\n",
+    "    use iso_fortran_env\n",
+    "    implicit none\n",
+    "    integer(kind=int16) :: ii\n",
+    "    integer :: output_handle\n",
+    "    open(newunit=output_handle, file='unformatted_data.dat', form='unformatted', &\n",
+    "        status='replace', action='write', access='sequential')\n",
+    "    write(output_handle) [(ii, ii=1, 10)]\n",
+    "    write(output_handle) \"Hello World\"\n",
+    "    close(output_handle)\n",
+    "end program write_unformatted\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "00000000  14 00 00 00 01 00 02 00  03 00 04 00 05 00 06 00  |................|\r\n",
+      "00000010  07 00 08 00 09 00 0a 00  14 00 00 00 0b 00 00 00  |................|\r\n",
+      "00000020  48 65 6c 6c 6f 20 57 6f  72 6c 64 0b 00 00 00     |Hello World....|\r\n",
+      "0000002f\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "!hexdump -C unformatted_data.dat"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can still see the values of 1 through 10 (`01 00` through `0a 00`), but you can also see that they are no longer the first thing in the file. \n",
+    "The file starts with `14 00 00 00`, or 20, which is the number of bytes in the array (ten 2-byte integers). After the array, the 20 is repeated, so that the record can also be found when moving backwards through the file.\n",
+    "\n",
+    "Next comes `0b 00 00 00`, or 11 -- exactly the number of bytes in \"Hello World\", again followed by a repeat of the record length, 11.\n",
+    "\n",
+    "### Python read\n",
+    "\n",
+    "To read this data in Python, you need to know the data type of the header, almost always an unsigned int, and usually 4 bytes in length:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[ 1  2  3  4  5  6  7  8  9 10]\n",
+      "b'Hello World'\n"
+     ]
+    }
+   ],
+   "source": [
+    "from scipy.io import FortranFile\n",
+    "ff=FortranFile('unformatted_data.dat', 'r', '<u4')\n",
+    "print(ff.read_record('<i2'))\n",
+    "print(b''.join(ff.read_record('S1')))\n",
+    "ff.close()"
+   ]
+  },
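+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To make the record structure explicit, the same file can also be decoded by hand. This is only a sketch, assuming 4-byte little-endian record markers as seen in the hexdump above; `read_records` is a hypothetical helper, not part of SciPy, and the bytes are rebuilt in Python so the example is self-contained:\n",
+    "\n",
+    "```python\n",
+    "import struct\n",
+    "\n",
+    "def read_records(raw, marker='<I'):\n",
+    "    # Split the raw bytes of an unformatted sequential file into payloads.\n",
+    "    size = struct.calcsize(marker)\n",
+    "    records, pos = [], 0\n",
+    "    while pos < len(raw):\n",
+    "        # Leading marker: length of the record in bytes\n",
+    "        (length,) = struct.unpack_from(marker, raw, pos)\n",
+    "        start = pos + size\n",
+    "        payload = raw[start:start + length]\n",
+    "        # Trailing marker must repeat the same length\n",
+    "        (trailer,) = struct.unpack_from(marker, raw, start + length)\n",
+    "        if trailer != length:\n",
+    "            raise ValueError('record markers disagree')\n",
+    "        records.append(payload)\n",
+    "        pos = start + length + size\n",
+    "    return records\n",
+    "\n",
+    "# The 47 bytes from the hexdump, rebuilt by hand.\n",
+    "raw = (struct.pack('<I', 20) + struct.pack('<10h', *range(1, 11))\n",
+    "       + struct.pack('<I', 20)\n",
+    "       + struct.pack('<I', 11) + b'Hello World' + struct.pack('<I', 11))\n",
+    "\n",
+    "first, second = read_records(raw)\n",
+    "print(struct.unpack('<10h', first), second)\n",
+    "```\n",
+    "\n",
+    "For real files `FortranFile` is the more convenient choice; the manual version is mainly useful for checking which marker size and byte order a given file uses."
+   ]
+  },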
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}