This repository provides an addon to use @Spidergram within DDEV.
Spidergram is a customizable toolkit for crawling and analyzing complicated web properties. While it can be used to crawl any website, we (the folks at Autogram) designed it specifically for “ten websites in a trench coat” scenarios where a web property encompasses multiple CMSs, multiple domains, and multiple design systems, maintained by multiple teams.
Create a new directory and move into it. For simplicity, this README uses the name spidercrawl throughout. You can choose any other name instead.
mkdir spidercrawl && cd spidercrawl
ddev config --auto
If you are running DDEV on macOS or Windows, it is highly recommended to enable Mutagen with the following additional configuration step.
ddev config --mutagen-enabled=true
On Linux, Windows with WSL2, and Gitpod that step is not necessary.
Install the ddev-spidergram addon:
ddev get rpkoller/ddev-spidergram
ddev start
ddev spidergram status
The resulting output should look like this:
$> ddev spidergram status
SPIDERGRAM CONFIG
Config file: /var/www/html/spidergram.config.yaml
ARANGODB
Status: online
URL: https://spidercrawl.ddev.site:8529
Database: db
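If the ArangoDB address is needed in a script, it can be read from the status output instead of being typed by hand. The following is a minimal sketch, assuming the plain-text "URL: <value>" line format shown above. The function name arangodb_url is hypothetical and not part of the addon.

```shell
#!/usr/bin/env bash

# arangodb_url: hypothetical helper, not shipped with the addon.
# Reads `ddev spidergram status` output on stdin and prints the value of the
# first line whose first field is "URL:" (assumed format, see above).
arangodb_url() {
  awk '$1 == "URL:" { print $2; exit }'
}

# Usage (requires a running project):
#   ddev spidergram status | arangodb_url
```

If the status output ever changes its layout or adds colour codes, the pattern would need adjusting.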
ddev spidergram go https://ddev.com
Spidergram reads its settings from the spidergram.config.yaml file.
The URL of the ArangoDB web interface is shown in the output of ddev spidergram status. The port :8529 is appended to the project’s URL (https://spidercrawl.ddev.site:8529).
To switch to the Spidergram database, click _SYSTEM in the upper right corner of the screen. In the select form on the next screen, click _system again, then choose the option db and confirm.
ddev arangodump
ddev delete spidercrawl --omit-snapshot
The database dump is saved in .ddev/arangodb-backup. After the dump succeeds, ddev delete spidercrawl --omit-snapshot deletes the project’s containers, images and volumes. The project files and the DDEV config files in .ddev, including the ArangoDB database dump, remain untouched. That saves disk space and lets you re-add the project later, as described in the second step.
ddev config
ddev start
ddev arangorestore
That way you re-register the existing project in DDEV, start it up, and restore the database you previously used in ArangoDB.
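Because the dump always lands in the same fixed directory (.ddev/arangodb-backup), each new dump replaces the last one. One possible workaround, sketched here under assumptions, is to copy that directory to a timestamped location right after each dump. The helper name archive_arangodb_backup and the target directory .ddev/arangodb-archive are hypothetical and not part of the addon.

```shell
#!/usr/bin/env bash

# archive_arangodb_backup: hypothetical helper, not shipped with the addon.
# Copies the dump directory to <dest_root>/<timestamp> and prints the new path.
#   $1  dump directory        (default: .ddev/arangodb-backup)
#   $2  archive root (assumed) (default: .ddev/arangodb-archive)
archive_arangodb_backup() {
  local src="${1:-.ddev/arangodb-backup}"
  local dest_root="${2:-.ddev/arangodb-archive}"
  local stamp dest
  stamp="$(date +%Y%m%d-%H%M%S)"
  dest="$dest_root/$stamp"

  if [ ! -d "$src" ]; then
    echo "no dump directory at $src" >&2
    return 1
  fi
  # Refuse to write into an existing archive (two calls within one second).
  if [ -e "$dest" ]; then
    echo "archive $dest already exists" >&2
    return 1
  fi

  mkdir -p "$dest_root"
  cp -R "$src" "$dest"
  echo "$dest"
}

# Usage, run from the project root after a dump:
#   ddev arangodump && archive_arangodb_backup
```

To restore an older copy with ddev arangorestore, its contents would have to be copied back to .ddev/arangodb-backup first. The archive directory is not covered by the .gitignore the addon creates, so it may need its own ignore rule.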
If you want to use arangodump not only for a final backup before deleting your project but also to keep one or more backups during daily use, note that the current implementation does not support this: running arangodump overwrites the previous dump. A more flexible and convenient solution is planned for the future.
Adds a Docker Compose file (docker-compose.arangodb.yaml) for ArangoDB. The Spidergram database and password are set to db to be in line with DDEV’s standards. The only difference is that the default username was left at root since it isn’t changeable in ArangoDB. The ArangoDB container is set to not require any authentication, which is in line with the Spidergram docker-compose file.
Adds a Dockerfile (Dockerfile.spidergram) to the web-build folder. It runs npm install --global spidergram, npx playwright install, and npx playwright install-deps when the addon is installed.
Adds a spidergram web command. For example, you only have to type ddev spidergram status instead of ddev exec spidergram status.
Adds spidergram.config.yaml to the project root. A YAML file with that exact file name is mandatory for Spidergram to run.
The config.ddev-spidergram.yaml file ensures that Node.js is set to version 18. A post-start hook also keeps the URL set in spidergram.config.yaml in line with the overall project settings. The project name, based on $DDEV_PROJECT, and the TLD, based on $DDEV_TLD, are replaced by a regex on every start. That way, if the project name or the TLD changes later, Spidergram keeps working.
Adds an arangodump web command. The database dump is written to a fixed destination, .ddev/arangodb-backup/. Currently arangodump is intended for backing up the database before a project is removed from DDEV.
Adds an arangorestore web command. Make sure that the folder with the database backup is available at .ddev/arangodb-backup/ within your project folder before you run ddev config && ddev start.
The .ddev/arangodb-backup/ directory is created with the -p option in a post_install_action, and a .gitignore file is added to the directory that excludes everything within it.
A note on spidergram.config.yaml: at the moment I am only using the default values from an old template found at https://github.com/autogram-is/create-spidergram/tree/main/templates.
Any feedback regarding bugs and potential improvements is welcome.
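To illustrate the URL rewrite that the post-start hook is described as doing, here is a rough sketch. It is not the addon's actual hook: the function name rewrite_arangodb_url is hypothetical, and it assumes the config file contains a URL of the form https://<host>:8529, as in the status output earlier in this README.

```shell
#!/usr/bin/env bash

# rewrite_arangodb_url: hypothetical helper, not the addon's real hook.
# Replaces the host of every https://<host>:8529 URL in a file with
# <project>.<tld>.
#   $1  config file
#   $2  project name (default: $DDEV_PROJECT)
#   $3  TLD          (default: $DDEV_TLD)
rewrite_arangodb_url() {
  local file="$1"
  local project="${2:-$DDEV_PROJECT}"
  local tld="${3:-$DDEV_TLD}"
  local tmp
  tmp="$(mktemp)" || return 1

  # Match everything between "https://" and ":8529" that is not a colon,
  # slash, or whitespace, and substitute the current project host.
  if ! sed "s#https://[^:/[:space:]]*:8529#https://${project}.${tld}:8529#g" \
      "$file" > "$tmp"; then
    rm -f "$tmp"
    return 1
  fi

  # Write back through the original file so its permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage inside the web container, where DDEV sets both variables:
#   rewrite_arangodb_url spidergram.config.yaml
```

Running the substitution on every start makes it idempotent: if the host already matches, the file content stays the same.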
Contributed and maintained by @rpkoller, based on the original ddev-addon-template.