I love the website Cartesius, which has a trove of old topographic maps of Belgium.
One of the maps that I consult the most is a stitch of all topographic maps of Belgium of 1969, zoomable and pannable like OpenStreetMap.
However, lately, the site has been acting strangely. The front portal continuously reloads, so it is hard to select anything. This has been going on for months and has not been fixed, so I guess it is no longer actively maintained, which bears the risk that it might go offline soon.
Therefore, I wanted to download these maps for backup, if possible.
Poking around in the Firefox inspector, I found that the map is built up from tiles. These have a url structure of http://www.ngi.be/tiles/arcgis/rest/services/seamless_carto__default__3857__1100/MapServer/tile/16/22109/33574
where the last two parts are respectively the y and x coordinates of this tile (y going up further south, x going up further east).
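To check which spot a tile covers, the row and column can be converted back to longitude and latitude. This is a minimal sketch, assuming the server follows the common Web Mercator tile grid (the "3857" in the URL suggests so, but I have not confirmed it). The name tile_to_lonlat is a hypothetical helper, and the result is the north-west corner of the tile.

```shell
#!/bin/bash
# Hypothetical helper: north-west corner of a tile as "lon lat" in degrees.
# Assumption: standard Web Mercator tile grid (row 0 at the top, column 0 at 180 W).
tile_to_lonlat() {
  # $1 = zoom level, $2 = row (y), $3 = column (x)
  awk -v z="$1" -v y="$2" -v x="$3" 'BEGIN {
    pi = atan2(0, -1)
    n = 2 ^ z                                  # tiles along each axis at this zoom
    lon = x / n * 360 - 180
    t = pi * (1 - 2 * y / n)
    lat = atan2((exp(t) - exp(-t)) / 2, 1) * 180 / pi   # atan(sinh(t))
    printf "%.4f %.4f\n", lon, lat
  }'
}

# The tile from the URL above
tile_to_lonlat 16 22109 33574
```

For the tile in the URL above this gives roughly 4.43 degrees east and 50.4 degrees north, which lies inside Belgium.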
I made a small trial script to download a 57*101 block of tiles:
for i in `seq 22059 22115`
do for j in `seq 33527 33627`
do curl http://www.ngi.be/tiles/arcgis/rest/services/seamless_carto__default__3857__1100/MapServer/tile/16/$i/$j -o "BasisTiles/$i-$j.jpg"
sleep 1 # so I don't get rate limited
done
done
This worked great! The hardest part was finding out what exactly I had downloaded. I wanted to make a composite image, but I ran into size limits, even though the largest dimension should be 25856 pixels (101 tiles of 256 pixels) and JPG should allow for 64k×64k pixels. I gave up on this, concentrating on downloading the tiles and seeing what to do later. As long as it exists, the website is more convenient than separate large image sheets.
Downloading the images
I took a screenshot of the entire, zoomed-out map of Belgium and put it in Inkscape.
I created rectangles to download, noting the border coordinates on this image.
I just trial-and-errored, changing the x and y coordinates, to see where the edge of the downloadable area was. If I went too far outside the map, I got an error message.
I found that the map is made up of different map sheets that are all about 25×41 tiles. So I could quickly guesstimate where the map borders were and just increase or decrease the coordinates by a single tile to find the edges. Some projection is used, as not all map edges are straight, and over longer runs they can have quite a curve. See for example the top of the “15” block (above), which has the y coordinate of 21063 on both edges but 21060 in the centre.
I made some unfortunate choices in the edges for my first blocks (1 to 5) so I had to download a 3 to 7 pixel high section at the eastern and western end (blocks 7A and 8A) and had some overlap in blocks because of overlaps (17 and 18) and because of errors (3/4 and 7). But other than that, the download seemed to run fine.
I downloaded each block from a script that had just the blocks like above stacked under each other, except a little more sophisticated:
# 23 Noord-Antwerpen <Block number and approximate location
export block=P23
# 79x169 = 13351 Tiles
mkdir -p "BasisTiles/$block"
for i in `seq 21781 21859`
do for j in `seq 33552 33720`
do
echo "downloading $block $i-$j"
curl http://www.ngi.be/tiles/arcgis/rest/services/seamless_carto__default__3857__1100/MapServer/tile/16/$i/$j -o "BasisTiles/$block/$i-$j.jpg"
sleep 1
done
done
In each block I calculated how many tiles there were, so I could compare this to the number of downloaded files.
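That comparison can be sketched as two small functions. The names expected_tiles and downloaded_tiles are hypothetical, and the example uses the trial block from the start of this post.

```shell
#!/bin/bash
# Hypothetical helpers for checking a block; the names are illustrative.

# Number of tiles in a rectangle: rows times columns, both bounds inclusive.
expected_tiles() {
  # $1 = first row, $2 = last row, $3 = first column, $4 = last column
  echo $(( ($2 - $1 + 1) * ($4 - $3 + 1) ))
}

# Number of regular files actually present in a block directory.
downloaded_tiles() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Trial block: 57 rows x 101 columns = 5757 tiles
expected_tiles 22059 22115 33527 33627
# Compare with what is on disk, e.g.: downloaded_tiles BasisTiles/P23
```

Because seq includes both end points, the count per axis is last minus first plus one, which is easy to get wrong by one.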
As a quick estimate showed this download would take about 70 hours, I moved this script to one of my Raspberry Pis, so I could shut down my power-slurping computer. Log in via SSH and
./Downloadmaps.sh &>>logfile.txt &
disown
On the first line, the “&>>” makes all messages (stdout and stderr) go to the logfile (and append to it). The “&” at the end lets the program run in the background.
Even though a program runs in the background, it still belongs to the logged-in session: when the SSH session ends, the shell can send its jobs a hangup signal (SIGHUP), and the program quits. The “disown” prevents that by removing the job from the shell’s job list, so it no longer receives that signal. It is a trick I learnt from Hackaday’s excellent Linux-fu series, which covers topics that usually go over my head but are good to know about.
This script happily ran along. Well, actually, I started running the script before I had found all the map edges, so I edited the script while it was running, which sometimes led to strange error messages. I just commented out everything that had already been downloaded and then restarted the script.
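Instead of commenting out finished blocks, the loop could skip tiles that are already on disk, so the script can simply be restarted. This is a minimal sketch and not the script I actually ran. The name fetch_tile is a hypothetical helper, and the use of --fail assumes the server answers missing tiles with an HTTP error status, which I have not verified.

```shell
#!/bin/bash
BASE_URL="http://www.ngi.be/tiles/arcgis/rest/services/seamless_carto__default__3857__1100/MapServer/tile/16"

# Hypothetical helper: download one tile unless it is already on disk.
fetch_tile() {
  # $1 = target directory, $2 = row (y), $3 = column (x)
  local out="$1/$2-$3.jpg"
  if [ -s "$out" ]; then
    return 0                      # non-empty file exists: skip, no request made
  fi
  # --fail makes curl exit non-zero on HTTP errors instead of saving the error page
  if ! curl --fail --silent --show-error "$BASE_URL/$2/$3" -o "$out"; then
    rm -f "$out"                  # do not leave a partial or empty file behind
    return 1
  fi
  sleep 1                         # stay polite, only after a real download
}

# Usage inside the block loop:
#   fetch_tile "BasisTiles/$block" "$i" "$j" || echo "failed $block $i-$j"
```

Since the sleep only happens after a real download, a restart runs quickly through everything that is already there.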
Quality control
Because I was a little lazy around the edges sometimes (especially where they were not straight), this process would sometimes encounter an HTML page with an error message (for a tile not found) and happily download it as a JPG file. Ordering the files by size, I noticed something else. Many files were between 8 and 16 kB, but some were as big as 130 kB, and those always had a transparent part. Opening one of these files with a text editor, I saw they started with “PNG”, indicating a PNG image instead of a JPG.
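The same check can be done without a text editor by looking at the first bytes of a file. This is a small sketch, with tile_type as a hypothetical helper name. It only knows the JPEG and PNG signatures and reports everything else, such as an HTML error page, as OTHER.

```shell
#!/bin/bash
# Hypothetical helper: classify a downloaded tile by its first bytes.
tile_type() {
  local magic
  # First four bytes as lowercase hex, e.g. "ffd8ffe0"
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  case "$magic" in
    ffd8ff*)  echo "JPG" ;;    # JPEG files start with FF D8 FF
    89504e47) echo "PNG" ;;    # PNG files start with 89 'P' 'N' 'G'
    *)        echo "OTHER" ;;  # for example an HTML error page
  esac
}

# Usage: tile_type BasisTiles/22059-33527.jpg
```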
This is not a problem (I can always convert them to JPG if needed), but I wanted some quality control, to see what I had downloaded. I wanted a “map of maps” where each tile is one pixel, coloured according to its contents. As I was working in Bash already, I did this in Bash as well.
#!/bin/bash
### Script to make an overview image what tiles are of what format
### any pixel on this file that is
### BLACK = tile not downloaded
### WHITE = PNG tile (with transparency)
### GREEN = JPG tile (what I want)
### RED = HTML file (error document: tile did not exist, but I downloaded it anyway)
### BLUE = Something else happened (I don't want this!)
#Map empty ./Merge/Links
echo "Clean up Merge/Links"
rm -fr Merge/Links
mkdir Merge/Links
#link all files to 1 big directory
echo "creating Links in ./Merge/Links/ (can take a minute)"
for i in BasisTiles/*
do
echo "$i"
for j in "$i"/*
do
# echo "creating link to $j"
ln -sr "$j" Merge/Links/
done
done
#make overview, 1 pixel per tile
echo "Creating Overview Pixels"
rm -fr Merge/OverviewPixels
rm -f Merge/Overview.png
mkdir Merge/OverviewPixels
mkdir Merge/OverviewPixels/Rows
for i in `seq 21690 22380`
do
for j in `seq 33200 33950`
do
if test -f "Merge/Links/$i-$j.jpg" #Does the tile exist?
then #file exists
fileType=`file -Lb "Merge/Links/$i-$j.jpg" | cut -c1-3` #first 3 characters
if [ "$fileType" = "JPE" ] #If JPG
then
ln -sr "Admin/1x1Green.png" "Merge/OverviewPixels/$i-$j.png"
elif [ "$fileType" = "PNG" ] #If PNG, edge tiles with transparency
then
ln -sr "Admin/1x1White.png" "Merge/OverviewPixels/$i-$j.png"
elif [ "$fileType" = "HTM" ] #If HTM (error document, tile didn't exist)
then
ln -sr "Admin/1x1Red.png" "Merge/OverviewPixels/$i-$j.png"
else #other file type
echo "Unexpected filetype $fileType at $i-$j"
ln -sr "Admin/1x1Blue.png" "Merge/OverviewPixels/$i-$j.png"
fi
else #tile does not exist!
ln -sr "Admin/1x1Black.png" "Merge/OverviewPixels/$i-$j.png"
fi
done
convert +append "Merge/OverviewPixels/$i-*.png" "Merge/OverviewPixels/Rows/$i-row.png"
echo "row $i made"
done
convert -append "Merge/OverviewPixels/Rows/*.png" "Merge/Overview.png"
The first part (everything before the nested loop) is just cleanup and admin. Because I downloaded all tiles in a directory per “block”, I made symlinks to all of them in one directory (Merge/Links).
In the nested loop, I go through all the possible tiles in a rectangle. For each tile, I test if the file exists (the test -f line).
If it does, I test the file type with the file command, and remove all but the first 3 characters with cut.
Then, in the if/elif/else branches, I create a link to a 1×1 pixel PNG file, with the tile coordinates as name.
The linked pixel is black if no tile was downloaded, green for JPG, white for PNG, red for an HTML document and blue otherwise.
Then I use convert (from the ImageMagick suite) to create an image with all the pixels from one row of tiles appended horizontally (the +append option), and finally, outside the loop, I create an image with all the “row” images stacked vertically (the -append option).
Here, I could see that the PNG images (white) were not only present around the edges, but also in the seamlines between some tiles.
I also noticed that I somehow managed to not download one row (the next-to-rightmost) from block 11, the left half of block 18, and that I had some parts that were not completely up to the edge (no PNG images, no white pixels). I downloaded these with some manual scripting.
I also noticed two red pixels within the image. It took some fiddling to find out where, but they were “502 proxy error”s. I re-downloaded these manually.
Fortunately, I got no blue pixels, so no unexpected file formats.
Next steps
I now have about 255604 files, about 3.2GB.
I could, or perhaps should, set up a tile server, but for now the Cartesius website is the most convenient way for me to see these maps. So I will probably not set up a tile server until the Cartesius site goes down.
Note to self: the main URL to Cartesius Search is now http://www.cartesius.be/geoportal/catalog/search/search.page?lang=en
Other tilesets can be downloaded with a new URL format: curl https://wmts.ngi.be/arcgis/rest/services/seamless_carto__default__3857__800/MapServer/tile/16/22101/33574