Assertion failed: (dsid > 0), function attach_dimscales, file nc4hdf.c, line 1417. #261

Closed
ryofurue opened this issue Jul 25, 2024 · 16 comments

Comments

@ryofurue

ryofurue commented Jul 25, 2024

Describe the bug

When I run the test program below, I get an assertion error as shown near the bottom of this report.

To Reproduce

The following test program needs an existing netCDF file, tmp-in.nc. (GitHub doesn't allow netCDF attachments, so I've zipped the file and attached it at the end of this report.) I tried to create a sample netCDF file from within the test program, but the simple file I created didn't trigger any error.

The test program try-ncdatasets.jl is

using NCDatasets

infnam = "tmp-in.nc"
oufnam = "tmp-out.nc"

NCDataset(infnam, "r") do ds
  myvar    = ds["MYVAR"]
  tax     = ds["TIME1_10"]
  NCDataset(oufnam, "c") do ou
    defVar(ou, "tax", tax)
    defVar(ou, "myvar", myvar[:,1], ("tax",))
  end
end

Expected behavior

My original purpose is to subset an existing netCDF file.

Environment

  • operating system: macOS 14.5
  • Julia version: julia 1.10.4 (2024-06-04), which is the latest one from juliaup update.
  • Output of the julia command versioninfo(): Shown in the following section.
  • NCDatasets version: v0.14.4
  • Output of using Pkg; Pkg.status(mode=PKGMODE_MANIFEST): Shown in the following section.

Full output

julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 8 × Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
julia> using Pkg; Pkg.status(mode=PKGMODE_MANIFEST)
Status `~/.julia/environments/v1.10/Manifest.toml`
  [621f4979] AbstractFFTs v1.5.0
  [1520ce14] AbstractTrees v0.4.5
  [79e6a3ab] Adapt v4.0.4
  [66dad0bd] AliasTables v1.1.3
  [27a7e980] Animations v0.4.1
  [67c07d97] Automa v1.0.4
  [13072b0f] AxisAlgorithms v1.1.0
  [39de3d68] AxisArrays v0.4.7
  [d1d4a3ce] BitFlags v0.1.9
  [fa961155] CEnum v0.5.0
  [179af706] CFTime v0.1.3
  [336ed68f] CSV v0.10.14
  [159f3aea] Cairo v1.0.5
  [13f3f980] CairoMakie v0.12.5
  [49dc2e85] Calculus v0.5.1
  [d360d2e6] ChainRulesCore v1.24.0
  [944b1d66] CodecZlib v0.7.5
  [a2cac450] ColorBrewer v0.4.0
  [35d6a980] ColorSchemes v3.26.0
  [3da002f7] ColorTypes v0.11.5
  [c3611d14] ColorVectorSpace v0.10.0
  [5ae59095] Colors v0.12.11
  [1fbeeb36] CommonDataModel v0.3.6
  [bbf7d656] CommonSubexpressions v0.3.0
  [34da2185] Compat v4.15.0
  [f0e56b4a] ConcurrentUtilities v2.4.2
  [187b0558] ConstructionBase v1.5.6
  [d38c429a] Contour v0.6.3
  [a8cc5b0e] Crayons v4.1.1
  [717857b8] DSP v0.7.9
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.6.1
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [927a84f5] DelaunayTriangulation v1.0.5
  [8bb1440f] DelimitedFiles v1.9.1
  [163ba53b] DiffResults v1.1.0
  [b552c78f] DiffRules v1.15.1
⌅ [3c3547ce] DiskArrays v0.3.23
  [31c24e10] Distributions v0.25.109
  [ffbed154] DocStringExtensions v0.9.3
  [fa6b7ba4] DualNumbers v0.6.8
  [4e289a0a] EnumX v1.0.4
  [429591f6] ExactPredicates v2.2.8
  [460bff9d] ExceptionUnwrapping v0.1.10
  [411431e0] Extents v0.1.3
  [8f5d6c58] EzXML v1.2.0
  [c87230d0] FFMPEG v0.4.1
  [7a1cc6ca] FFTW v1.8.0
  [5789e2e9] FileIO v1.16.3
  [8fc22ac5] FilePaths v0.8.3
  [48062228] FilePathsBase v0.9.21
  [1a297f60] FillArrays v1.11.0
  [53c48c17] FixedPointNumbers v0.8.5
  [1fa38f19] Format v1.3.7
  [f6369f11] ForwardDiff v0.10.36
  [b38be410] FreeType v4.1.1
  [663a7486] FreeTypeAbstraction v0.10.3
⌃ [28b8d3ca] GR v0.73.5
  [cf35fbd7] GeoInterface v1.3.5
  [5c1252a2] GeometryBasics v0.4.11
  [a2bd30eb] Graphics v1.1.2
  [3955a311] GridLayoutBase v0.11.0
  [42e2da0e] Grisu v1.0.2
  [cd3eb016] HTTP v1.10.8
  [34004b35] HypergeometricFunctions v0.3.23
  [2803e5a7] ImageAxes v0.6.11
  [c817782e] ImageBase v0.1.7
  [a09fc81d] ImageCore v0.10.2
  [82e4d734] ImageIO v0.6.8
  [bc367c6b] ImageMetadata v0.9.9
  [9b13fd28] IndirectArrays v1.0.0
  [d25df0c9] Inflate v0.1.5
  [842dd82b] InlineStrings v1.4.2
  [a98d9a8b] Interpolations v0.15.1
  [d1acc4aa] IntervalArithmetic v0.22.14
  [8197267c] IntervalSets v0.7.10
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [f1662d9f] Isoband v0.1.1
  [c8e1da08] IterTools v1.10.0
  [82899510] IteratorInterfaceExtensions v1.0.0
  [1019f520] JLFzf v0.1.7
  [692b3bcd] JLLWrappers v1.5.0
  [682c06a0] JSON v0.21.4
  [b835a17e] JpegTurbo v0.1.5
  [5ab0869b] KernelDensity v0.6.9
  [8ac3fa9e] LRUCache v1.6.1
  [b964fa9f] LaTeXStrings v1.3.1
  [23fbe1c1] Latexify v0.16.4
  [8cdb02fc] LazyModules v0.3.1
  [2ab3a3ac] LogExpFunctions v0.3.28
  [e6f89c97] LoggingExtras v1.0.3
  [1914dd2f] MacroTools v0.5.13
  [ee78f7c6] Makie v0.21.5
  [20f20a25] MakieCore v0.8.4
  [dbb5928d] MappedArrays v0.4.2
  [7eb4fadd] Match v2.1.0
  [0a4f8689] MathTeXEngine v0.6.1
  [739be429] MbedTLS v1.1.9
  [442fdcdd] Measures v0.3.2
  [e1d29d7a] Missings v1.2.0
  [5cb8414e] ModuleInterfaceTools v1.0.1
  [e94cdb99] MosaicViews v0.3.4
  [bdf0d083] MultiFloats v2.0.2
  [85f8d34a] NCDatasets v0.14.4
  [77ba4419] NaNMath v1.0.2
  [f09324ee] Netpbm v1.1.1
  [510215fc] Observables v0.5.5
  [6fe1bfb0] OffsetArrays v1.14.1
  [52e1d378] OpenEXR v0.3.2
  [4d8831e6] OpenSSL v1.4.3
  [bac558e1] OrderedCollections v1.6.3
  [90014a1f] PDMats v0.11.31
  [f57f5aa1] PNGFiles v0.4.3
  [19eb6ba3] Packing v0.5.0
  [5432bcbf] PaddedViews v0.5.12
  [69de0a69] Parsers v2.8.1
  [b98c9c47] Pipe v1.3.0
  [eebad327] PkgVersion v0.3.3
  [ccf2f8ad] PlotThemes v3.2.0
  [995b91a9] PlotUtils v1.4.1
  [91a5bcdd] Plots v1.40.5
  [647866c9] PolygonOps v0.1.2
  [f27b6e38] Polynomials v4.0.11
  [2dfb63ee] PooledArrays v1.4.3
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.3.2
  [92933f4c] ProgressMeter v1.10.2
  [43287f4e] PtrArrays v1.2.0
  [4b34888f] QOI v1.0.0
  [1fd47b50] QuadGK v2.9.4
  [b3c3ace0] RangeArrays v0.3.2
  [c84ed2f1] Ratios v0.4.5
  [3cdcf5f2] RecipesBase v1.3.4
  [01d81517] RecipesPipeline v0.6.12
  [189a3867] Reexport v1.2.2
  [05181044] RelocatableFolders v1.0.1
  [ae029012] Requires v1.3.0
  [79098fc4] Rmath v0.7.1
  [5eaf0fd0] RoundingEmulator v0.2.1
  [fdea26ae] SIMD v3.5.0
  [6c6a2e73] Scratch v1.2.1
  [e8f3a9d7] SearchSortedNearest v0.1.1
  [91c51154] SentinelArrays v1.4.5
  [efcf1570] Setfield v1.1.1
  [65257c39] ShaderAbstractions v0.4.1
  [992d4aef] Showoff v1.0.3
  [73760f76] SignedDistanceFields v0.4.0
  [777ac1f9] SimpleBufferStream v1.1.0
  [699a6c99] SimpleTraits v0.9.4
  [45858cf5] Sixel v0.1.3
  [a2af1166] SortingAlgorithms v1.2.1
  [276daf66] SpecialFunctions v2.4.0
  [cae243ae] StackViews v0.1.1
  [90137ffa] StaticArrays v1.9.7
  [1e83bf80] StaticArraysCore v1.4.3
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [4c63d2b9] StatsFuns v1.3.1
  [b5087856] StrFormat v1.0.1
  [68059f60] StrLiterals v1.1.0
  [892a3eda] StringManipulation v0.3.4
  [09ab397b] StructArrays v0.6.18
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [62fd8b95] TensorCore v0.1.1
  [731e570b] TiffImages v0.10.0
  [3bb67fe8] TranscodingStreams v0.11.1
  [981d1d27] TriplotBase v0.1.0
  [5c2747f8] URIs v1.5.1
  [1cfade01] UnicodeFun v0.4.1
  [1986cc42] Unitful v1.21.0
  [45397f5d] UnitfulLatexify v1.6.4
  [41fe7b60] Unzip v0.2.0
  [ea10d353] WeakRefStrings v1.4.2
  [efce3f68] WoodburyMatrices v1.0.0
  [76eceee3] WorkerUtilities v1.6.1
  [fdbf4ff8] XLSX v0.10.1
  [a5390f91] ZipFile v0.10.1
  [6e34b625] Bzip2_jll v1.0.8+1
  [4e9b3aee] CRlibm_jll v1.0.1+0
  [83423d85] Cairo_jll v1.18.0+2
  [5ae413db] EarCut_jll v2.2.4+0
  [2702e6a9] EpollShim_jll v0.0.20230411+0
  [2e619515] Expat_jll v2.6.2+0
⌅ [b22a6f82] FFMPEG_jll v4.4.2+2
  [f5851436] FFTW_jll v3.3.10+0
  [a3f928ae] Fontconfig_jll v2.13.96+0
  [d7e528f0] FreeType2_jll v2.13.2+0
  [559328eb] FriBidi_jll v1.0.14+0
  [0656b61e] GLFW_jll v3.4.0+0
⌅ [d2c73de3] GR_jll v0.73.5+0
  [a660ed4b] GeographicLib_jll v1.52.0+0
  [78b55507] Gettext_jll v0.21.0+0
  [7746bdde] Glib_jll v2.80.2+0
  [3b182d85] Graphite2_jll v1.3.14+0
⌅ [0234f1f7] HDF5_jll v1.12.2+2
  [2e76f6c2] HarfBuzz_jll v2.8.1+1
  [905a6f67] Imath_jll v3.1.11+0
  [1d5cc7b8] IntelOpenMP_jll v2024.2.0+0
  [aacddb02] JpegTurbo_jll v3.0.3+0
  [c1c5ebd0] LAME_jll v3.100.2+0
⌅ [88015f11] LERC_jll v3.0.0+1
  [1d63c593] LLVMOpenMP_jll v15.0.7+0
  [dd4b983a] LZO_jll v2.10.2+0
⌅ [e9f186c6] Libffi_jll v3.2.2+1
  [d4300ac3] Libgcrypt_jll v1.8.11+0
  [7e76a0d4] Libglvnd_jll v1.6.0+0
  [7add5ba3] Libgpg_error_jll v1.49.0+0
  [94ce4f54] Libiconv_jll v1.17.0+0
  [4b2f31a3] Libmount_jll v2.40.1+0
⌅ [89763e89] Libtiff_jll v4.5.1+1
  [38a345b3] Libuuid_jll v2.40.1+0
  [856f044c] MKL_jll v2024.2.0+0
⌃ [7243133f] NetCDF_jll v400.902.5+1
  [e7412a2a] Ogg_jll v1.3.5+1
  [18a262bb] OpenEXR_jll v3.2.4+0
⌅ [458c3c95] OpenSSL_jll v1.1.23+0
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [91d4177d] Opus_jll v1.3.2+0
  [36c8627f] Pango_jll v1.52.2+0
  [30392449] Pixman_jll v0.43.4+0
⌅ [c0090381] Qt6Base_jll v6.5.2+2
  [f50d1b31] Rmath_jll v0.4.2+0
  [a44049a8] Vulkan_Loader_jll v1.3.243+0
  [a2964d1f] Wayland_jll v1.21.0+1
  [2381bf8a] Wayland_protocols_jll v1.31.0+0
  [02c8fc9c] XML2_jll v2.13.1+0
  [aed1982a] XSLT_jll v1.1.41+0
  [ffd25f8a] XZ_jll v5.4.6+0
  [f67eecfb] Xorg_libICE_jll v1.1.1+0
  [c834827a] Xorg_libSM_jll v1.2.4+0
  [4f6342f7] Xorg_libX11_jll v1.8.6+0
  [0c0b7dd1] Xorg_libXau_jll v1.0.11+0
  [935fb764] Xorg_libXcursor_jll v1.2.0+4
  [a3789734] Xorg_libXdmcp_jll v1.1.4+0
  [1082639a] Xorg_libXext_jll v1.3.6+0
  [d091e8ba] Xorg_libXfixes_jll v5.0.3+4
  [a51aa0fd] Xorg_libXi_jll v1.7.10+4
  [d1454406] Xorg_libXinerama_jll v1.1.4+4
  [ec84b674] Xorg_libXrandr_jll v1.5.2+4
  [ea2f1a96] Xorg_libXrender_jll v0.9.11+0
  [14d82f49] Xorg_libpthread_stubs_jll v0.1.1+0
  [c7cfdc94] Xorg_libxcb_jll v1.17.0+0
  [cc61e674] Xorg_libxkbfile_jll v1.1.2+0
  [e920d4aa] Xorg_xcb_util_cursor_jll v0.1.4+0
  [12413925] Xorg_xcb_util_image_jll v0.4.0+1
  [2def613f] Xorg_xcb_util_jll v0.4.0+1
  [975044d2] Xorg_xcb_util_keysyms_jll v0.4.0+1
  [0d47668e] Xorg_xcb_util_renderutil_jll v0.3.9+1
  [c22f9ab0] Xorg_xcb_util_wm_jll v0.4.1+1
  [35661453] Xorg_xkbcomp_jll v1.4.6+0
  [33bec58e] Xorg_xkeyboard_config_jll v2.39.0+0
  [c5fb5394] Xorg_xtrans_jll v1.5.0+0
  [3161d3a3] Zstd_jll v1.5.6+0
  [35ca27e7] eudev_jll v3.2.9+0
⌅ [214eeab7] fzf_jll v0.43.0+0
  [1a1c6b14] gperf_jll v3.1.1+0
  [9a68df92] isoband_jll v0.2.3+0
  [a4ae2306] libaom_jll v3.9.0+0
  [0ac62f75] libass_jll v0.15.1+0
  [2db6ffa8] libevdev_jll v1.11.0+0
  [f638f0a6] libfdk_aac_jll v2.0.2+0
  [36db933b] libinput_jll v1.18.0+0
  [b53b4c65] libpng_jll v1.6.43+1
  [075b6546] libsixel_jll v1.10.3+0
  [f27f6e37] libvorbis_jll v1.3.7+2
  [009596ad] mtdev_jll v1.1.6+0
  [1317d2d5] oneTBB_jll v2021.12.0+0
  [1270edf5] x264_jll v2021.5.5+0
  [dfaa095f] x265_jll v3.5.0+0
  [d8fb68d0] xkbcommon_jll v1.4.1+1
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [8bf52ea8] CRC32c
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [1a1011a3] SharedArrays
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [4607b0f0] SuiteSparse
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+4
  [05823500] OpenLibm_jll v0.8.1+2
  [efcefdf7] PCRE2_jll v10.42.0+1
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.8.0+1
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`

The following is the error:

$ julia try-ncdatasets.jl
Assertion failed: (dsid > 0), function attach_dimscales, file nc4hdf.c, line 1417.

[28173] signal (6): Abort trap: 6
in expression starting at /Users/furue/Dropbox/work/trenches/try-ncdatasets.jl:6
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 9704284 (Pool: 9696768; Big: 7516); GC: 14
fish: Job 1, 'julia try-ncdatasets.jl' terminated by signal SIGABRT (Abort)
$

tmp-in.nc.zip

@Alexander-Barth
Owner

I can reproduce this issue on Linux using your script and file (with NetCDF_jll v400.902.211+0 and v400.902.211+1):

julia> include("/home/abarth/Downloads/test_assert_nc.jl");
Precompiling NCDatasets
   15 dependencies successfully precompiled in 16 seconds. 29 already precompiled.
julia: nc4hdf.c:1469: attach_dimscales: Assertion `dsid > 0' failed.

[5427] signal (6.-6): Aborted
in expression starting at /home/abarth/Downloads/test_assert_nc.jl:6
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
raise at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7790b502871a)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
attach_dimscales at /home/abarth/.julia/artifacts/6205439632c585b34a0ae3e1fdbd87d6a4d51a81/lib/libnetcdf.so (unknown line)

How was the input file created? Does the error persist if you recreate the input file (if that is possible for you)?
I am wondering whether there is an issue with the input file that causes an out-of-bounds write in the netCDF C library, which then crashes the library in a seemingly unrelated part of the code.

@Alexander-Barth
Owner

This works:

NCDataset(infnam, "r") do ds
  myvar    = ds["MYVAR"]
  tax     = ds["TIME1_10"]
  NCDataset(oufnam, "c") do ou
    defVar(ou, "tax", tax)
    defVar(ou, "myvar", myvar[:,1], ("TIME1_10",))
  end
end

In defVar(ou, "tax", tax), you actually create a dimension named TIME1_10, just as it is named in the source file, so the second defVar has to refer to TIME1_10 rather than tax.
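
One way to check this is to ask the new variable for its dimension names. The following is a minimal sketch, assuming NCDatasets is installed; the temporary file names and the helper copied_dims are hypothetical, and the small source file only imitates the relevant part of tmp-in.nc.

```julia
using NCDatasets

src = tempname() * ".nc"   # throwaway source file (hypothetical name)
dst = tempname() * ".nc"   # throwaway output file (hypothetical name)

# Build a small source file whose coordinate variable is named TIME1_10.
NCDataset(src, "c") do ds
    defVar(ds, "TIME1_10", Int32.(1:10), ("TIME1_10",))
end

# Hypothetical helper: copy the variable under the new name "tax" and
# report which dimension names end up in the output file.
function copied_dims(src, dst)
    var_dims = ()
    file_dims = String[]
    NCDataset(src, "r") do ds
        NCDataset(dst, "c") do ou
            v = defVar(ou, "tax", ds["TIME1_10"])
            var_dims = dimnames(v)              # dimensions of the new variable
            file_dims = collect(keys(ou.dim))   # dimensions defined in the file
        end
    end
    return var_dims, file_dims
end

var_dims, file_dims = copied_dims(src, dst)
println(var_dims)
println(file_dims)
```

If the copy keeps the source dimension name, as described above, neither list should contain a dimension called tax.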

@ryofurue
Author

Thanks for your response. After I posted my initial report here, I realized my error as you point out. So, my problem has been solved. But . . .

I let the report stand as is, because the point of my report was not to ask for your help to fix my code.

An assertion error is an internal error that should not be propagated to the user. That is my report here.

I'm sure you understand this, but just in case . . . in practice, it's hard to debug your code if the error message is not meaningful to you. In addition, an internal error sometimes means that you have to wait for the library writer to fix it before you can use the library again.

@Alexander-Barth
Owner

Yes, an assertion error is always a bug as far as I can tell. But in this case, it is an assertion internal to the netCDF C library, which is maintained by the fine folks at Unidata (https://github.com/Unidata/netcdf-c) and should not be considered "my" (or NCDatasets') code. Do you have time to report it to Unidata?

@ryofurue
Author

Do you have time to report it to unidata?

Yes. If not now, I'm willing to do it soon. But to do so, wouldn't we need a C or Fortran program to replicate the problem? That was the reason why I didn't submit the issue to unidata in the first place. What's the call from within the NCDatasets Julia package that's causing this?

@Alexander-Barth
Owner

Alexander-Barth commented Jul 31, 2024

Yes, C code would be ideal, but maybe they can give additional insight even without it. It turns out, though, that this is a known issue.

What's the call from within the NCDatasets Julia package that's causing this?

Yes, this can also be the case. But changing the dimension names (i.e. the data of the function call, not the function calls themselves) avoids the issue. NetCDF dimensions and HDF5 dimensions do not map one to one:

https://docs.unidata.ucar.edu/netcdf-c/current/interoperability_hdf5.html

As I understand it, HDF5 dimension scales always have values attached to them, whereas in netCDF you can have the dimension tax = 10 and, optionally, the variable tax(tax) with dimension tax. So dimensions without associated variables need to be emulated. The example creates the variable tax with dimension TIME1_10 and then needs to create a dimension with the name tax. This is a code path that may not have been well tested.
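
The netCDF side of this distinction can be sketched as follows, assuming NCDatasets is installed. The file name and the dimension name n are hypothetical. The dimension tax is defined before the variable tax, which is the ordinary coordinate-variable case and not the problematic order discussed in this issue.

```julia
using NCDatasets

fname = tempname() * ".nc"   # throwaway file (hypothetical name)

ds = NCDataset(fname, "c")
defDim(ds, "n", 10)                         # a dimension with a length but no values
defDim(ds, "tax", 10)                       # define the dimension first ...
defVar(ds, "tax", Int32.(1:10), ("tax",))   # ... then the optional variable tax(tax)
has_n_var   = haskey(ds, "n")     # is there a variable named n?
has_tax_var = haskey(ds, "tax")   # is there a variable named tax?
close(ds)

println((has_n_var, has_tax_var))
```

Here n exists only as a dimension, while tax exists both as a dimension and as a variable holding its values.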

Avoiding the do-blocks and using close,

using NCDatasets
infnam = "tmp-in.nc"
oufnam = "tmp-out.nc"

ds = NCDataset(infnam, "r")
myvar    = ds["MYVAR"]
tax     = ds["TIME1_10"]
ou = NCDataset(oufnam, "c")
defVar(ou, "tax", tax)
defVar(ou, "myvar", myvar[:,1], ("tax",))
close(ou)
close(ds)

This leads to a "NetCDF: HDF error", which might be easier to work with (but I also get an assertion error in attach_dimscales when the file is closed).

While looking for attach_dimscales I found this issue:
Unidata/netcdf-c#1772

Short question: In a netcdf4-classic file, can a fixed-size dimension share the same name with a previously-defined variable?

Indeed the following reproduces the error without input files:

using NCDatasets
ou = NCDataset("test.nc", "c")
defDim(ou, "TIME1_10", 10)
nctax = defVar(ou, "tax", Int32,("TIME1_10",))
nctax[:] = 1:10
defDim(ou, "tax", 10)  # no errors
ncmyvar = defVar(ou, "myvar", Int32, ("tax",))
ncmyvar[:] = 1:10 # HDF error
close(ou) # assertion error

A subsequent call to close(ou) leads to the assertion error (Assertion failed: (dsid > 0), function attach_dimscales, file nc4hdf.c, line 1417).

@ryofurue
Author

ryofurue commented Aug 5, 2024

Thank you for homing in on the cause of the error. But I can't reproduce your assertion error. I commented out your "HDF error" line:

using NCDatasets
ou = NCDataset("test.nc", "c")
defDim(ou, "TIME1_10", 10)
nctax = defVar(ou, "tax", Int32,("TIME1_10",))
nctax[:] = 1:10
defDim(ou, "tax", 10)  # no errors
ncmyvar = defVar(ou, "myvar", Int32, ("tax",))
#ncmyvar[:] = 1:10 # HDF error
close(ou) # assertion error

I still got "NetCDF: HDF error (NetCDF error code: -101)".

How did my original code reach the assertion error without stopping at the HDF error?

If the netCDF library reports the HDF error and stops there, then that's not a bug in the library . . . . The netCDF-library developers will need to be shown a case of assertion error . . .

@ryofurue
Author

ryofurue commented Aug 5, 2024

I've found the answer to my question

How did my original code reach the assertion error without stopping at the HDF error?

I got the HDF error in the Julia REPL. Then, when I quit the Julia interpreter, I got the assertion error.

So, I guess that the recipe is to get the HDF error first but to proceed to close the netCDF file, ignoring the error.

@ryofurue
Author

ryofurue commented Aug 5, 2024

I've translated your Julia code into Fortran as faithfully as possible, but then the error disappeared! The generated netCDF file was a correct one! See below.

So, we need to know the sequence of netCDF-library calls which the Julia code generates.

But then, am I using the right netCDF version? With Fortran, I use netCDF 4.6.1.


In the translation below, one difference is that you cannot interleave defDim and nctax[:] = 1:10 if you use the original netCDF library. After everything is defined, you have to call nc_enddef (nf90_enddef in Fortran) before writing values to the variables.

program try_netcdf_error
  use netcdf
  implicit NONE
  integer:: ncid, i, dimid_time, varid_tax, dimid_tax, varid_myvar
  integer, parameter:: tax(10) = [(i,i=1,10)]

  !-- ou = NCDataset("test.nc", "c")
  i = nf90_create("test.nc", cmode = NF90_CLOBBER, ncid=ncid)

  !-- defDim(ou, "TIME1_10", 10)
  i = nf90_def_dim(ncid, name="TIME1_10", len=10, dimid=dimid_time)

  !-- nctax = defVar(ou, "tax", Int32,("TIME1_10",))
  i = nf90_def_var(ncid, name="tax", xtype=NF90_INT, &
      dimids=dimid_time, varid=varid_tax)

  !-- nctax[:] = 1:10 . . . Impossible before nf90_enddef().
  !! i = nf90_put_var(ncid, varid=varid_tax, values=tax)

  !-- defDim(ou, "tax", 10)  # no errors
  i = nf90_def_dim(ncid, name="tax", len=10, dimid=dimid_tax)

  !-- ncmyvar = defVar(ou, "myvar", Int32, ("tax",))
  i = nf90_def_var(ncid, name="myvar", xtype=NF90_INT, &
      dimids=dimid_tax, varid=varid_myvar)

  i = nf90_enddef(ncid) !-- End the def section.

  !-- nctax[:] = 1:10
  i = nf90_put_var(ncid, varid=varid_tax, values=tax)
  if (i/=NF90_NOERR) write(*,*) trim(nf90_strerror(i))

  !-- ncmyvar[:] = 1:10 # HDF error
  i = nf90_put_var(ncid, varid=varid_myvar, values=tax)
  if (i/=NF90_NOERR) write(*,*) trim(nf90_strerror(i))

  !-- close(ou) # assertion error
  i = nf90_close(ncid)
end program try_netcdf_error

On the command line, I compile this with

$ gfortran -I/opt/homebrew/include  try-netcdf-error.f90 -L/opt/homebrew/lib -lnetcdff -lnetcdf

as my netCDF library is installed under /opt/homebrew/.

@Alexander-Barth
Owner

Alexander-Barth commented Aug 6, 2024

Actually, the error is HDF5-specific: it requires cmode = nf90_netcdf4 (which is the default for NCDataset but not for Fortran).
I cannot get the Fortran version to crash either, on Ubuntu with netcdf-fortran 4.5.4 and netcdf-c 4.8.1.
Maybe the error is also specific to the actual version of NetCDF.

Normally, the HDF5 error stops the normal program flow and prevents nc_close from being reached, except in a do block, where the error is caught, the file is closed, and the error is then raised again. The garbage collector also closes the netCDF file when the variable goes out of scope (the garbage collector is called on all objects in your session when you close Julia).
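
The do-block behaviour can be sketched in plain Julia. This only illustrates the usual try/finally pattern and is not NCDatasets' actual source; FakeFile, with_file, and events are hypothetical names, and no netCDF library is involved.

```julia
# Hypothetical stand-in for a file handle; not an NCDatasets type.
mutable struct FakeFile
    isopen::Bool
end

# Hypothetical helper imitating the do-block form: run `f`, then always close.
function with_file(f, events)
    file = FakeFile(true)
    try
        return f(file)
    finally
        # Reached even when `f` throws, so "close" runs after a failed write.
        file.isopen = false
        push!(events, "close")
    end
end

events = String[]
try
    with_file(events) do file
        push!(events, "write")
        error("simulated HDF error")   # stands in for the failing write
    end
catch err
    # The original error resurfaces only after the file has been closed.
    push!(events, "rethrown: " * err.msg)
end
println(events)
```

The recorded order is write, close, rethrown. That is why, in the do-block version, the close step (where the assertion fires) is reached even though the write has already failed.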

@ryofurue
Author

ryofurue commented Aug 6, 2024

I think your conjecture on the cause of the error is very accurate. I've converted your Julia program into a minimal C++ program and found that 1) if def_dim( . . . "tax" . . . ) comes before def_var(. . . "tax" . . . ), there is no error; but 2) if def_dim comes after def_var, "HDF error" occurs when trying to write into the variable "tax".

I guess that the root cause of the problem is that the following conflict is not gracefully handled within the netCDF library:

  1. The variable "tax" has a dimension "TIME".
  2. The dimension "tax" and the variable "tax" have the same name, which means "tax" is a dimension variable.

I think that the netCDF library should issue a proper warning or error when this conflict happens.

Perhaps I'll submit this as a bug report to Unidata.

// try_netcdf_error.cc
#include <iostream>
#include "netcdf.h"

int main() {
  int i, ncid, dimid_time, varid_tax, dimid_tax;
  const int tax[] = {1,2,3,4,5,6,7,8,9,10};

  i = nc_create("test.nc", NC_NETCDF4, &ncid);
  i = nc_def_dim(ncid, "TIME", 10, &dimid_time);
  //i = nc_def_dim(ncid, "tax", 7, &dimid_tax); //(1) -> fine.
  i = nc_def_var(ncid, "tax", NC_INT, 1, &dimid_time, &varid_tax); // pass the single dimension ID by address
  i = nc_def_dim(ncid, "tax", 7, &dimid_tax); //(2) -> HDF error on put_var
  i = nc_enddef(ncid);

  i = nc_put_var(ncid, varid_tax, tax);
  if (i != NC_NOERR) {std::cerr << nc_strerror(i) << std::endl;}

  i = nc_close(ncid);
}

@Alexander-Barth
Owner

Alexander-Barth commented Aug 6, 2024

In order to use the same NetCDF libraries as Julia (including all dependencies), I executed a sample C program within the BinaryBuilder environment, and I can reproduce the core dump:

sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.2 # gcc -o nc_error nc_error.c $(nc-config --cflags --libs)
sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.2 # ./nc_error 
Could not put myvar.
before calling close
nc_error: nc4hdf.c:1469: attach_dimscales: Assertion `dsid > 0' failed.
Aborted (core dumped)

The C code is:

#include <stdio.h>
#include <string.h>
#include <netcdf.h>
int main() {
  int ncid, var_time_id, dim2_id, dim_id, var_myvar_id, status;
  int data[3] = {1,2,3};
  status = nc_create("test.nc", NC_CLOBBER|NC_NETCDF4, &ncid);
  if(status != NC_NOERR) {
    printf("Could not create the file.\n");
  }
  status = nc_def_dim(ncid, "time2", 3, &dim2_id);
  if(status != NC_NOERR) {
    printf("Could not create time2.\n");
  }  
  status = nc_def_var(ncid, "time", NC_INT, 1, &dim2_id, &var_time_id);
  if(status != NC_NOERR) {
    printf("Could not create variable time.\n");
  }
  status = nc_put_var_int(ncid, var_time_id, data);
  if(status != NC_NOERR) {
    printf("Could not put var.\n");
  }
  status = nc_def_dim(ncid, "time", 3, &dim_id);
  if(status != NC_NOERR) {
    printf("Could not create dim time.\n");
  }     
  status = nc_def_var(ncid, "myvar", NC_INT, 1, &dim_id, &var_myvar_id);
  if(status != NC_NOERR) {
    printf("Could not create myvar.\n");
  }
  status = nc_put_var_int(ncid, var_myvar_id, data);
  if(status != NC_NOERR) {
    printf("Could not put myvar.\n");
  }    
  printf("before calling close\n");
  status = nc_close(ncid);
  printf("error code after close = %d\n", status);
  return 0;
}

However, the core dump is not triggered when using all the libraries (netCDF and dependencies) from Linux (Ubuntu). The error is also not triggered when I compile netCDF version 4.9.2 from source (while still keeping HDF5 1.10.7 from Ubuntu).
In my tests, the core dump occurs only when using the libraries from the Julia registries, which are much newer than those on my Ubuntu 22.04 (for example, HDF5 is at version v1.14.3 in Julia's BinaryBuilder).

all dependencies for reference
  [692b3bcd] + JLLWrappers v1.5.0
⌃ [3da0fdf6] + MPIPreferences v0.1.0
  [21216c6a] + Preferences v1.4.3
  [0b7ba130] + Blosc_jll v1.21.5+0
  [6e34b625] + Bzip2_jll v1.0.8+1
  [0951126a] + GnuTLS_jll v3.8.4+0
  [0234f1f7] + HDF5_jll v1.14.3+3
  [e33a78d0] + Hwloc_jll v2.11.1+0
  [5ced341a] + Lz4_jll v1.10.0+0
  [7cb0a576] + MPICH_jll v4.2.2+0
⌃ [f1f71cc9] + MPItrampoline_jll v5.3.1+1
  [9237b28f] + MicrosoftMPI_jll v10.1.4+2
  [458c3c95] + OpenSSL_jll v3.0.14+0
  [c2071276] + P11Kit_jll v0.24.1+0
  [02c8fc9c] + XML2_jll v2.13.1+0
  [ffd25f8a] + XZ_jll v5.4.6+0
  [3161d3a3] + Zstd_jll v1.5.6+0
  [477f73a3] + libaec_jll v1.1.2+0
  [337d8026] + libzip_jll v1.10.1+0
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts v1.3.0
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [b77e0a4c] + InteractiveUtils
  [4af54fe1] + LazyArtifacts v1.3.0
  [b27032c2] + LibCURL v0.6.4
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.9.2
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
→ [e66e0078] + CompilerSupportLibraries_jll v1.1.1+0
  [781609d7] + GMP_jll v6.3.0+0
  [deac9b47] + LibCURL_jll v7.73.0+8
  [e37daf67] + LibGit2_jll v1.2.3+1
  [29816b5a] + LibSSH2_jll v1.11.1+0
  [c8ffd9c3] + MbedTLS_jll v2.24.0+4
  [14a3606d] + MozillaCACerts_jll v2024.7.2+0
  [83775a58] + Zlib_jll v1.3.1+0
sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.2 # gcc --version
x86_64-linux-gnu-gcc (GCC) 5.2.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.2 # cat /etc/issue 
Welcome to Alpine Linux 3.15
Kernel \r on an \m (\l)

# configure options
./configure --prefix=/workspace/destdir --build=x86_64-linux-musl --host=x86_64-linux-gnu --enable-shared --disable-static --disable-dap-remote-tests --disable-plugins

My understanding is that the NetCDF developers already know about the "HDF5 error". But I think that the more problematic assertion error on nc_close is new.

@Alexander-Barth
Owner

I can report the assertion error to the NetCDF developers as a new issue unless you want to do it :-)

@ryofurue
Author

ryofurue commented Aug 7, 2024

I can report the assertion error to the NetCDF developers as a new issue unless you want to do it :-)

It would be wonderful if you do it! Your C code would be much more helpful to the netCDF devs than mine! Thank you.

I'll read your report and their responses later. If necessary, then, I may raise the issue of using the same name between a variable and a dimension. Even when the assertion error isn't triggered (it isn't in my C++ code), the use of the same name shouldn't result in error.

@ryofurue
Author

ryofurue commented Aug 7, 2024

Wow, your report is super comprehensive!! Thank you! I have nothing to add.

@Alexander-Barth
Owner

The upstream PR has been merged:
Unidata/netcdf-c#2968

However, it will take some time before it is available to Julia users.
