This post appeared originally in our sysadvent series and has been moved here following the discontinuation of the sysadvent microsite.
As mentioned in a previous blog entry, this site is deployed to an S3 website bucket when the Git master branch receives a push. Here I explain how we created and configured that website bucket, as well as the Varnish configuration in front of it.
The S3 storage we use is Ceph with an S3-compatible Ceph Object Gateway (radosgw) interface, but the process should work for any S3-compatible storage with website-bucket functionality.
In this post the “variables” $s3host and $s3web are used. These refer to the host name of the S3 API host_base (e.g. “data.example.com”) and the website endpoint domain name (e.g. “data-website.example.com”) respectively.
Creating the website bucket
Creating a website bucket is quite easy using the s3cmd tool. First make the regular bucket, then configure it as a website bucket:
s3cmd mb s3://rl_web.sysadvent.prod
s3cmd ws-create --ws-error=/sysadvent/404.html s3://rl_web.sysadvent.prod
We didn’t need to specify the index file, since the default (index.html) gives us what we need.
The result is a functioning website at https://rl_web.sysadvent.prod.$s3web (replace $s3web with the bucket-website frontend), serving whatever content is in the bucket.
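To confirm that the bucket really is configured as a website, s3cmd can print the website settings and the public URL can be fetched directly. The following is only a minimal sketch, assuming s3cmd is already configured for $s3host and that curl is installed; the helper names website_url and check_site are made up for illustration and are not part of our setup.

```shell
# Build the bucket-website URL from a bucket name ($1) and the
# website endpoint domain ($2, i.e. the value of $s3web).
website_url() {
  printf 'https://%s.%s/\n' "$1" "$2"
}

# Print the website configuration of the bucket, then the HTTP status
# code returned for the site root. Only defined here, not called,
# since it needs network access and s3cmd credentials.
check_site() {
  s3cmd ws-info "s3://$1"
  curl -sS -o /dev/null -w '%{http_code}\n' "$(website_url "$1" "$2")"
}

# Usage (hypothetical): check_site rl_web.sysadvent.prod "$s3web"
```

Which status code comes back for the site root depends on what has been uploaded so far, so treat the output as a hint rather than a pass/fail check.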
Uploading to the bucket
The previous gitlab-ci post listed the command for uploading content to the bucket (replace $s3host with the S3 API host name):
s3cmd --no-mime-magic --access_key=$ACCESS_KEY --secret_key=$SECRET_KEY \
--acl-public --delete-removed --delete-after --no-ssl --host=$s3host \
--host-bucket="%(bucket)s.$s3host" \
sync sysadvent s3://rl_web.sysadvent.prod
The given “keys” variables are used for accessing the bucket. We found we had to add --no-mime-magic to turn off magic-based MIME type detection (as opposed to detection based on the filename suffix), because it caused the CSS files to be uploaded as text/plain instead of text/css. The --delete-* options ensure that we don’t end up with cruft in the website bucket when we delete files in the repositories.
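A quick way to see which type a file is actually served with is to look at the Content-Type response header. The snippet below is only a sketch, assuming curl is available; the content_type helper and the style.css path are hypothetical names chosen for the example.

```shell
# Print the Content-Type value found in HTTP response headers read
# from stdin. Carriage returns are stripped first, and the header
# name is matched case-insensitively.
content_type() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "content-type" { print $2; exit }'
}

# Usage (hypothetical file name, replace $s3web as elsewhere in this post):
#   curl -sI "https://rl_web.sysadvent.prod.$s3web/sysadvent/style.css" | content_type
```

If this prints text/plain for a CSS file, the upload was most likely done without --no-mime-magic.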
Presentation
At this point we have a fully functional website in a bucket, but we want the site to be a sub-directory of our main site. The main site already has a Varnish frontend for caching purposes, so we want to use this frontend for the sysadvent site as well. This means defining the bucket-website as a new backend in our Varnish configuration.
The vcl_recv is pretty self-explanatory if you’re familiar with Varnish:
sub vcl_recv {
    if (req.url ~ "^/sysadvent($|/)") {
        # bucket-website requires trailing / to get index.html
        if (req.url == "/sysadvent") {
            set req.url = "/sysadvent/";
        }
        # required to get to the correct bucket-website
        # (replace $s3web with your bucket-website frontend)
        set req.http.host = "rl_web.sysadvent.prod.$s3web";
        # sysadvent is a normal hash director with all the
        # rl_web.sysadvent.prod.$s3web frontends as backends
        set req.backend_hint = sysadvent.backend(req.http.X-Forwarded-For);
    }
}
We had a problem with requests for non-existing URLs returning 403. This is probably an issue in the software we use (Ceph/RadosGW) or our configuration of it, but in any case this is easy to work around with Varnish:
sub vcl_backend_response {
    if (bereq.url ~ "^/sysadvent/") {
        # Backend returns 403 on missing file. Point it to the custom 404 page
        # instead.
        if (beresp.status == 403) {
            set bereq.url = "/sysadvent/404.html";
            return(retry);
        }
        # 404 is returned as a 200, because of the above config. Fix it.
        if (bereq.url == "/sysadvent/404.html") {
            set beresp.status = 404;
        }
        # Remove S3 headers
        unset beresp.http.X-Amz-Meta-S3cmd-Attrs;
        unset beresp.http.X-Amz-Request-Id;
    }
    # We set a lot of TTLs on a global scope (not just for
    # /sysadvent), e.g.
    if (!(beresp.http.Cache-Control) || beresp.http.Cache-Control !~ "max-age") {
        set beresp.ttl = 15m;
    }
}
The bucket-website does not return any Cache-Control header, so we add our own. It does however add an Etag, which is quite nice.
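The 403 workaround and the header clean-up can be checked from the outside by requesting a page that does not exist and inspecting the response headers. This is a rough sketch, assuming curl is available; the helper names and the www.example.com host are hypothetical stand-ins for the main site behind Varnish.

```shell
# Print the numeric status code from the first line of the HTTP
# response headers read from stdin.
status_code() {
  tr -d '\r' | awk 'NR == 1 { print $2 }'
}

# Succeed only if the headers read from stdin contain no X-Amz-* header.
no_amz_headers() {
  ! grep -qi '^x-amz-'
}

# Usage (hypothetical host and path):
#   curl -sI "https://www.example.com/sysadvent/no-such-page" | status_code
#   curl -sI "https://www.example.com/sysadvent/" | no_amz_headers && echo "clean"
```

With the VCL above in place, one would expect the first command to print 404 rather than 403, but that depends on the backend behaving as described here.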
Result
You’re looking at the result of the above process – this blog.