Latest posts
-
Most stale bots are anti-user and anti-contributor, but they don't have to be
If you’ve been around open source projects on GitHub you may have encountered a project with a stale bot.
-
Python version epochs are broken
In PEP 440 Python introduced Version Epochs as a mechanism to allow projects to change versioning scheme. Unfortunately there’s no way I could see a project actually making use of this without confusing their users.
-
Creating GitHub Releases automatically on tags
GitHub Releases is a feature where you can create a page associated with a git tag that contains a description of the changes in that tag along with build artifacts for users to download.
-
A beginner's guide to managing Kubernetes resources in Python with kr8s
Managing Kubernetes resources with Python has never been easier thanks to the kr8s Kubernetes client for Python.
-
Running Dask on Databricks
Databricks is a very popular data analytics platform used by data scientists, engineers, and businesses around the world. It was founded by the creators of Apache Spark, a powerful open-source data processing engine, and builds on top of Spark to provide a comprehensive analytics platform.
-
Running Dask workloads on multiple cluster backends with zero code changes using dask-ctl
Sometimes you want to write some code using Dask which can then be run against multiple different cluster backends. For example, for local testing you might want to use `LocalCluster`, but in production use `KubeCluster`. Or perhaps you want to easily switch between an on-premise HPC with `SLURMRunner` or the cloud with `Coiled`.
-
EffVer: Version your code by the effort required to upgrade
Version numbers are hard to get right. Semantic Versioning (SemVer) communicates backward compatibility via version numbers, which often leads to a false sense of security and broken promises. Calendar Versioning (CalVer) sits at the other extreme of communicating almost no useful information at all.
-
How to highlight lines in a Hugo code block
Sometimes when writing code in a blog post I want to emphasize a couple of lines in particular. Today I found out that Hugo has really nice syntax to do this in a regular markdown code-fence.
-
How to get typer to show help by default
I love using typer for creating CLI tools in Python. It makes creating complex trees of subcommands really straightforward.
-
GitHub streaks and work/life balance
I recently read Loving and hating the Streak by Cassidy Williams. The post was all about committing code on GitHub every single day to maintain a streak.
-
Comparison of kr8s vs other Python libraries for Kubernetes
I’ve been working on kr8s for a while now and one of my core goals is to build a Python library for Kubernetes that is the simplest and most readable, and that produces the most maintainable code. It should enable folks to write dumb code when working with Kubernetes.
-
The challenge of updating an aging blog
The Dask blog is a bit neglected these days. The website is an aging Jekyll blog and is well past its prime. Bringing it into the current decade has been on my backlog for a while and today I decided to dedicate some time to getting it up to date.
-
How I fixed my UniFi Devices intermittently showing as offline
Since upgrading to a UniFi Dream Machine (UDM) Pro I’ve had a problem with some of my UniFi devices showing as offline. TL;DR: It turned out that I accidentally had two controllers on my network and the devices were hopping back and forth between them.
-
Livestream notes: Replacing aiohttp with httpx in kr8s
This post will be updated with notes from the livestream throughout the day. Today I will be streaming some open source code refactoring. Come and join in on Twitch! Don’t forget to say hi in the chat.
-
Introducing kr8s, a new Kubernetes client library for Python inspired by kubectl
For the last few months I’ve been tinkering with a new Kubernetes client library for Python called kr8s.
-
Avoid indirection in tests at all costs
When writing tests the balance between avoiding indirection and DRY-ness should be much more weighted towards avoiding indirection than in the code it is testing.
-
Mini demos
May 17, 2023 4 minute read #work
As software engineers we should all be able to communicate the things we have built to others, but giving a formal demo of something you’ve been working on can be daunting. Mini demos are a great way to build muscles around giving ad-hoc demos with little to no preparation.
-
Debugging Data Science workflows at scale
May 12, 2023 15 minute read #python, #dask, #kubernetes, #apache-beam, #google-cloud, #google-kubernetes-engine
The more we scale up our workloads the more we run into bugs that only appear at scale. Reproducing these bugs can be expensive, time consuming and error prone. In order to report a bug on a GitHub repo you generally need to isolate the bug and come up with a minimal reproducer so that the maintainer can investigate. But what if a minimal reproducer requires hundreds of servers to isolate and replicate?
-
Running Jupyter in your Dask Kubernetes cluster
Did you know that the Dask scheduler has a `--jupyter` flag that will start a Jupyter server running within the Dask Dashboard?
-
Being intentional with container terminology
When writing and speaking about Linux container technologies I’m trying to be more intentional with the words I use, which often means avoiding the word docker. My goal is to communicate clearly to both experts and novices alike.
-
Oversubscribing GPUs in Kubernetes
Sometimes I want to oversubscribe the GPUs in my Kubernetes cluster. This is especially useful when I’m developing but could also be useful in light workloads where you have ample GPU memory and don’t mind the occasional failure.
-
Quick and dirty way to pre-pull container images on Kubernetes
Sometimes when I give live demos with Kubernetes clusters I want to make sure that the container images I’m going to use are already pulled onto all of the nodes in my cluster. The last thing I want is for a `Pod` to be created and then sit in a `Pending` state while an image is pulled, especially given how large containers can be in the Data Science space.
-
Debugging Sphinx extensions in VSCode
This week I’ve been working on some custom Sphinx extensions for a documentation site.
Sphinx is a pretty complex tool with a broad ecosystem so documentation tends to be spread across the upstream project, dependencies like docutils and popular extensions like MyST. Therefore figuring out what is going on can be challenging, so I almost always resort to digging through state in a debugger and doing code spelunking on GitHub.
-
Sometimes I regret using CalVer
Over the last few years, many open-source Python projects that I work on have switched to CalVer. I’ve felt some pain around this, particularly in Dask and its subprojects. I want to unpack some of my thoughts and feelings around this trend.
-
Narrative driven development
In July I published a blog post on using Dask on KubeFlow with the Dask Kubernetes Operator. I originally outlined that post in January before the Dask Operator even existed as part of my planning for that work.
-
Accelerating ETL on KubeFlow with RAPIDS
Aug 30, 2022 11 minute read #dask, #etl, #kubeflow, #pandas, #rapids, #technical-walkthrough Archive
In the machine learning and MLOps world, GPUs are widely used to speed up model training and inference, but what about the other stages of the workflow like ETL pipelines or hyperparameter optimization?
-
How to check your NVIDIA driver and CUDA version in Kubernetes
When using GPUs with Kubernetes it can be important to know which driver and CUDA versions are installed on the nodes.
-
Using Dask on KubeFlow with the Dask Kubernetes Operator
Kubeflow is a popular Machine Learning and MLOps platform built on Kubernetes for designing and running Machine Learning pipelines for training models and providing inference services. It has a notebook service that lets you launch interactive Jupyter servers (and more) on your Kubernetes cluster as well as a pipeline service with a DSL library written in Python for designing and building repeatable workflows. It also has tools for hyperparameter tuning and running model inference servers, everything you need to build a robust ML service.
-
Don't prematurely squash/rebase and force push your PRs
A big frustration for me when reviewing Pull Requests on GitHub is coming back to a PR you’ve already reviewed to check on recent changes and being greeted with “We went looking everywhere, but couldn’t find those commits”.
-
Commenting on Pull Requests with GitHub Actions
When someone opens a Pull Request (PR) on your GitHub project it can be helpful for a bot to comment on the PR. You might want to thank the user for the contribution, provide some useful information such as giving a binder link where folks can try out the PR, or providing more verbose output from some tests or other checks.
-
The secret to making code contributions that stand the test of time
When you contribute code to collaborative projects, whether they are open source community projects or large internal projects inside organisations, the feeling of having your code running inside a large application can be very rewarding.
-
How to set environment variables on your Dask workers
When working with Dask clusters you often need the remote worker environment to match your local environment. This generally means having the same packages and data available.
-
Golang block until interrupt with ctrl+c
Today I found myself needing a Go application’s main thread to stop and wait until the user wants it to exit with a `ctrl+c` keyboard interrupt.
-
Goodbye Docker Desktop for Mac, Hello Colima
Today is the deadline for the license changes to Docker Desktop for Mac and Windows. This means that if you are employed at a company with more than 250 employees or your company makes more than $10m you need to start paying a subscription to continue using Docker Desktop.
-
Docker Desktop for Mac alternatives for developers
In a couple of days Docker will begin charging employees of companies with >250 employees to use Docker Desktop. I have no problem with paying for software that brings me value, but you wouldn’t believe how complex it can be for large companies to sign employees up to subscription services. Paperwork everywhere! To avoid this I’m evaluating alternatives for Docker Desktop to use on my MacBook.
-
Running Kubeflow inside Kind with GPU support
This week I’ve been playing around with Kubeflow as part of a larger effort to make it simpler to use Dask and RAPIDS in MLOps workflows.
-
Quick hack: Adding GPU support to kind
This post has been superseded with this tutorial that no longer requires any code changes. Please read that instead.
Don't be that open-source user, don't be me
Before I was a maintainer of open source software I was a user of open source software, and I sometimes behaved badly.
Branding your open source Python package
Having a brand can help give your open source project some legitimacy, and you don’t need to be a designer to see these benefits. However it is important to understand that you do not need to add branding to your project in order for it to be successful, and adding branding can even harm your project.
What is the difference between Dask and RAPIDS?
Both Dask and RAPIDS are Python libraries to scale your workflow and empower you to process more data and leverage more compute resources. Both use interfaces modeled after the PyData ecosystem, making them familiar to most data practitioners.
The evolution of a Dask Distributed user
This week was the 2021 Dask Summit and one of the workshops that we ran covered many deployment options for Dask Distributed.
Building a contributor community for your open source project
With our open source project published on GitHub we probably want to allow folks to contribute changes. Some users of the project may find bugs, or desire extra features and will open issues to tell you. Users who have the skills required to make that change can open a Pull Request on GitHub to propose it. As the maintainer you can then review and merge those changes.
Communicating with your open source community
Once your open source Python project has users and a community you will likely want to communicate with them in an official capacity. Perhaps you want to tell them about a new release, show a use case where someone is using your tool or solicit feedback on an upcoming feature.
Building a user community for your open source project
Now that our open source Python project exists and users can install it we will want to turn our attention to sustainability, reach and ongoing maintenance. By putting it out there and gaining users you are opening yourself up to questions, bug reports and feature requests.
Documenting Python projects with Sphinx and Read the Docs
In part four of this series we discussed documenting our code as we went along by adding docstrings throughout our project. In this post we will see that effort pay off by building a documentation site using Sphinx which will leverage all of our existing docstrings.
Monitoring Dask + RAPIDS with Prometheus + Grafana
Prometheus is a popular monitoring tool within the cloud community. It has out-of-the-box integration with popular platforms including Kubernetes, OpenStack, and the major cloud vendors, and integrates with dashboarding tools like Grafana.
Automating releases of Python packages with GitHub Actions
In this post we will cover automatically packaging and releasing our project when a new git tag is pushed to GitHub.
Testing and Continuous Integration for Python packages with GitHub Actions
In this post we will cover automatically running our tests when we push new code to GitHub, and when contributors raise Pull Requests against our project.
Awaitable Objects and Async Context Managers in Python
Python objects are synchronous by default. When working with `asyncio`, if we create an object the `__init__` is a regular function and we cannot do any async work in here.
Test driven development in Python
What is test driven development (TDD)?
Test driven development is a style of development where you write your tests before you write your code.
Testing your Python package
In this post we will cover testing our code.
Testing
There are many many great resources out there for learning about testing software. In this post I’m going to try and focus on simple examples that you can use to get started quickly. Once you have a good foundation for your tests you can then dive into mocking, replaying HTTP requests or even hypothesis testing.
Documenting your Python code
This post will cover documenting our code. Specifically adding documentation within the code itself.
Docstrings
Right now our code is undocumented, so if the user inspects our function they will only see the interface (the way you call it) but with no other context. We can use IPython to quickly inspect this.
How to interactively debug GitHub Actions with netcat
Update: This was a fun experiment and I recommend you check out the post for a fun read on setting up reverse shells. But I’ve since discovered this awesome tmate action which lets you interactively debug in the browser or via SSH.
How to check out the default git branch
Many open source projects are taking steps to update terminology to be more inclusive. The largest of these changes has been renaming the “trunk” branch of git repositories from `master` to `main`.
Leveraging the Hacktoberfest community
Hacktoberfest is approaching once again. In previous years I have both participated and contributed to open source, and also tried to leverage the community in the open source projects I maintain by curating and labeling issues.
Running Dask tutorials
Aug 21, 2020 20 minute read #python, #dask, #distributed-computing, #open-source, #community, #tutorials Archive
Originally published on the Dask blog on August 21st, 2020.
For the last couple of months we’ve been running community tutorials every three weeks or so. The response from the community has been great and we’ve had 50-100 people at each 90 minute session.
The current state of distributed Dask clusters
Originally published on the Dask blog on July 23rd, 2020.
Dask enables you to build up a graph of the computation you want to perform and then executes it in parallel for you. This is great for making best use of your computer’s hardware. It is also great when you want to expand beyond the limits of a single machine.
How to use OBS Studio with Zoom, Hangouts, Teams and more on macOS
A popular tool with streamers and YouTubers is Open Broadcaster Software® Studio or OBS for short. It allows you to compose scenes with cameras, desktop sharing, video snippets, images, web pages and more and then stream that video to services like Twitch or Mixer. You can also save recordings locally if you want to upload them to YouTube.
How to enable SSH on Binder
⚠️ This post is no longer valid.
Running SSH on Binder has not been possible since late 2020. Due to abuse from botnets Binder will now kill sessions running `sshd`.
Publishing open source Python packages on GitHub, PyPI and Conda Forge
In this post we will cover making our code available to people. This is the bit where we open the source! We will push our code to a code hosting platform and then package up our library and submit it to a couple of repositories to make it easy for people to install.
Versioning and formatting your Python code
In this post, we will cover a few project hygiene things that we may want to put into place to make our lives easier in the future.
Testing static sites with Lighthouse CI and GitHub Actions
Feb 13, 2020 7 minute read #python, #github, #tutorial, #github-actions, #static-sites, #lighthouse-ci
When you build a website you want pages to load as quickly as possible for users. Google has a tool called PageSpeed Insights which you can run on your website to see various metrics about the page. I’ve used it in the past while working on my blog and other sites.
Creating an open source Python project from scratch
Have you had a great idea for an open-source Python library that you think people will find useful, but you don’t know where to begin in creating and publishing it?
Twenty Nineteen Roundup
Introduction
It has been a few years since I published a list of technology and media I enjoyed this year, so here we go for 2019.
5 Tips to help you ace your internship and entry-level job interviews
Applying for internships and entry-level positions can be tricky. Interviewers want to hear you talk about your experiences and things you’ve done that prove you’re a good fit for the job. However given that you are applying for an entry-level role you likely don’t have much real world experience in this space. It’s a chicken and egg situation that everyone faces when they are first starting out or wanting to make a shift to a new area.
Creating GitHub Actions in Python
Note: This post is also available in Go flavour.
GitHub Actions provide a way to automate your software development workflows on GitHub. This includes traditional CI/CD tasks on all three major operating systems such as running test suites, building applications and publishing packages. But it also includes automated greetings for new contributors, labelling pull requests based on the files changed, or even creating cron jobs to perform scheduled tasks.
Creating GitHub Actions in Go
Note: This post is also available in Python flavour.
GitHub Actions provide a way to automate your software development workflows on GitHub. This includes traditional CI/CD tasks on all three major operating systems such as running test suites, building applications and publishing packages. But it also includes automated greetings for new contributors, labelling pull requests based on the files changed, or even creating cron jobs to perform scheduled tasks.
How to run Jupyter Lab at startup on macOS
In my day to day work I generally access a variety of Jupyter installations. Sometimes these are short lived installations in conda environments on my laptop, sometimes they are running on a remote server, and sometimes I use a managed service like JupyterHub or Binder.
How to create a Helm chart repository with Chartpress, Travis CI, GitHub Pages and Jekyll
Helm has become a pervasive tool in the Kubernetes community for packaging, managing, upgrading and distributing applications. It uses a packaging format called charts which are a collection of templates that describe Kubernetes resources and can be configured by the user.
How to merge Kubernetes kubectl config files
Sometimes when working with a new Kubernetes cluster you will be given a config file to use when authenticating with the cluster. This file should be placed at `~/.kube/config`. However you may already have an existing config file at that location and you need to merge them together.
The three types of fun
According to folks who enjoy outdoor activities there are three types of fun. I’ve been using this scale for a while to categorize my own enjoyment of things and wanted to share my version.
Why your profile picture is important
Choosing a good profile picture will make collaborating with others easier, especially if you haven’t met them yet. Here are some tips to help you pick a good one.
Cleaning up conda environments
Often when I’m developing or debugging in Python I end up creating throw away conda environments. They will be to test some package installation or combination of packages and once I’ve finished I will probably never use them again.
Setting up GPU Data Science Environments for Hackathons
Originally published on the RAPIDS AI blog on August 13th, 2019.
Background
In my first week working at NVIDIA, I have been spending some time with my previous colleagues at the Met Office to explore how the two organizations can collaborate.
Switching to Hugo
It has been nearly two years since I published a new blog post on this website. That doesn’t mean I haven’t been writing things. It’s just that much of my content has been posted on other platforms. I’ve decided recently to gather everything together and make this website the canonical source for the things I produce. This includes blog posts, talks, videos and more.
Hypothetical datasets
In Theo’s previous posts on storing high momentum data and its accompanying metadata we get some interesting insights into the future of cloud based data storage. In this post I’m going to cover how we are working with today’s NetCDF-based challenges, by making assumptions!
Intro to Earth Information Workshop
This article was originally written for the Met Office workshop run at the Intro to Earth Information event on the 12th of March 2019.
My pragmatic workshop format
Jan 30, 2019 7 minute read #workshop, #conference-planning, #facilitation, #public-speaking, #training Archive
Mozfest workshop facilitators meeting
Figuring out the right format for a workshop can be tricky. There are so many factors: what is the subject, do people need any equipment, how many people will attend, how many facilitators will there be, where will it be held, what level of expertise will the participants have, the list goes on…
Debugging Kubernetes PVCs
Sometimes I find that something goes wrong in a container and some data stored in a persistent volume gets corrupted. This may result in me having to get my hands dirty and have a poke around in the filesystem myself.
Using Xiaomi door/window sensors as light switches
Introduction
For a while I’ve been searching for a decent light switch solution for my home automation setup. I’ve recently put in a pretty good solution using Xiaomi door/window sensors. I’m very happy with it and it ticks a lot of boxes.
Exploring Dask and Distributed on AWS Lambda
I spent some time this week exploring whether it would be possible to run Dask and Distributed on a function as a service platform like AWS Lambda.
Instant access to auto-scaling personal Python clusters
Originally published on the Met Office Informatics Lab blog on February 7th, 2018.
We are excited to announce that the work we’ve been doing with distributed Dask clusters running on Kubernetes has been absorbed into an awesome new tool called Daskernetes through our work on the Pangeo project.
ChatOps - Automation via chat
Originally published on the Met Office Informatics Lab blog on December 19th, 2017.
This article is a companion to a workshop on using chat to automate ops workflows. This is a static version of a Jupyter Notebook which you can download here.
Deploying opsdroid using ZEIT
ZEIT is a great platform for deploying your opsdroid instance. Particularly because it is free for light use, which many opsdroid deployments will be.
Article in Computer Weekly
Aug 1, 2017 1 minute read #events
This week Computer Weekly have published an article interviewing me about how the Met Office is tackling the vast amount of data we are producing. They reference the work the Informatics Lab have done on the Jade project and Met Office public datasets.
Adaptive Dask clusters on Kubernetes and AWS
Originally published on the Met Office Informatics Lab blog on July 21st, 2017.
Introduction
This article assumes a basic understanding of Amazon Web Services (AWS), Kubernetes, Docker and Dask. If you are unfamiliar with any of these you should do some preliminary research before continuing.
Generate git release notes automatically
It is common practice for release notes to consist of a list of the Pull Requests which have been merged since the last release. Some projects divide these into categories, for example breaking changes, enhancements and bug fixes. If you are a project maintainer you may want to be able to generate this automatically.
How to create a seal only token for Hashicorp Vault
Introduction
When using Hashicorp’s Vault you may want to have an authentication token which only has permissions to seal the vault. This can then be used in an emergency situation to seal the vault, perhaps through a chatbot.
RITA 2017 Innovation Award
My team won a Real IT Award for Innovation! Check out the full post here.
Monitoring scalable infrastructure
Originally published on the Met Office Informatics Lab blog on May 8th, 2017.
Recently we’ve been thinking a lot about monitoring. In a world of ephemeral servers, auto-scaling, spot instances and infrastructure-as-code, monitoring has to be tackled differently.
Using Jupyter notebooks for SysAdmin, CloudOps and DevOps workflows.
Originally published on the Met Office Informatics Lab blog on May 8th, 2017.
Jupyter notebooks are awesome. If you speak to a data scientist or analyst who writes Python there’s a very good chance that they use Jupyter notebooks. But I think there’s another community that would benefit hugely from including them in their standard arsenal of tools, and that’s folks in IT Infrastructure.
Moving large volumes of data to S3
Originally published on the Met Office Informatics Lab blog on April 20th, 2017.
We just moved ~80TB of data to S3 (stay tuned to hear what we’re doing with it).
Apple Airport Express Repair
Introduction
I recently acquired an Apple Airport Express wireless hotspot which wouldn’t power on. This was most likely down to a fault on the power supply board and so I decided to have a go at fixing it.
Building Telegraf for 32bit FreeBSD
Introduction
Currently InfluxData do not provide a 32bit FreeBSD build of Telegraf as part of their standard packages. Luckily it is easy to build yourself.
Build games for iOS 10 with Xcode 8 and Game Maker Studio 1.4
Introduction
Building games for iOS is straightforward with Game Maker. You create the game as normal in Game Maker, but in order to build it you must have a Mac with Xcode installed. You must configure Game Maker with the IP address of your Mac and the username and password. When you build the project Game Maker will produce an Xcode compatible project, copy it onto your Mac and open it in Xcode. You can find comprehensive instructions on the YoYo Games website.
A game on the perception of symbols
Originally published on the Met Office Informatics Lab blog on June 24th, 2016.
With some friends look out the window and each choose a weather symbol which represents what you see. Do you all agree?
Running Telegraf inside a docker container
Telegraf is an application for collecting server and application telemetry and metrics and sending them to a time series datastore like InfluxDB. Like me you may prefer running all of your applications in Docker containers, however this means Telegraf will only collect data for the container. This article will cover the configuration options to allow Telegraf to collect host metrics from inside a container.
Getting started with VMwares ESXi/vSphere API in Python
In 2013 VMware dropped their Python library for accessing the API for ESXi/vSphere on GitHub. This is great, however it isn’t the easiest library in the world to use. This quick guide will show you how to connect to an ESXi host or vSphere cluster and get some info about a virtual machine.
Cracking Enigma with Go
Originally published on the Met Office Informatics Lab blog on June 2nd, 2016.
Can I crack the Enigma code with Go on a MacBook? Yes!
A note on AWS disk performance testing
Here is an interesting note on testing the disk performance of your AWS instances. Before you can accurately test the performance of your EBS disk you need to read all the sectors of the disk at least once if that disk was created from a snapshot.
Interactive Docker containers
Interactive shell
In this post we are going to create a container we can interact with. We can then have a poke around inside the container and see what it is and how it works.
Running a Docker container
Installing Docker
Installing Docker on your machine is required but beyond the scope of this series. Getting Docker up and running is an ever evolving and improving process and anything put here will go stale reasonably quickly. As Docker uses Linux kernel features you will need a running Linux operating system. Therefore installing Docker on Linux is easy; however, installing it on Windows and Mac involves running a lightweight Linux virtual machine.
What is Docker?
Welcome to the first post in ‘Intro to Docker’, a series I’m writing to get beginners started with Docker. Each post in this series will be no more than a 5 minute read and will cover exactly one topic surrounding Docker.
How to use an Xbox 360 controller with OS X El Capitan
Introduction
In order to use an Xbox 360 controller with OS X El Capitan you will need to install a driver for it. This is an update to my article on using an Xbox 360 controller with Yosemite.
MacBook Rebuild
Introduction
I’ve decided to wipe my MacBook and reinstall everything! The current installation has been hanging around for the last few years and has even migrated between hardware with Time Machine. Things are starting to feel a little slow, I’ve installed lots of things in the past which I don’t want any more, and I just want a fresh start. I’ve backed it up with Time Machine just in case and I’m ready to get started.
Pretty git logs with `git lg`
This is a repost of a Stack Overflow answer, mainly to preserve it for myself. Slipp Thompson posted some really nice aliases for showing branch topology in the git command line.
Fixing the SSH roaming vulnerability (CVE-2016-0777)
A vulnerability in the OpenSSH client has been discovered which means that if you SSH to a compromised server the server can steal your private key. This affects any operating system with OpenSSH client 5.4 and above, which is pretty much all flavors of Linux and OS X.
Twenty Fifteen Roundup
Introduction
Here is a list of the technology and media I enjoyed this year, a follow up on last year’s post.
Thoughts on Star Wars: The Force Awakens
Spoilers ahead, you have been warned!
I thoroughly enjoyed the new Star Wars. Like many people, I assume, I went to the cinema with concerns that the film would not live up to expectations. However, roughly around the moment where Kylo Ren trapped the blaster shot in mid air those worries went away.
A Raspberry Pi Docker Cluster
Originally published on the Met Office Informatics Lab blog on December 12th, 2015.
Introduction
We are fortunate in the Lab to have a small stash of Raspberry Pis in our cupboard which are used at hackathons and other events. As there are no events using them currently I thought I’d take the opportunity to make a nice demonstration piece to show off clustering containers.
Building with Kubernetes
Originally published on the Met Office Informatics Lab blog on October 1st, 2015.
For our 3D visualisation project we wanted to build a data processing service using Docker containers. We quickly found that once you are running more than a couple of containers you need a way to manage them. After looking into the different tools available we decided to give Kubernetes a go. This is what we learned.
Quick Tip - git delete merged branches
Here’s a quick line to run in your terminal to delete all local git branches which have already been merged into master.
Quick Tip - em vs rem
`em` and `rem` are used in CSS to set a size value relative to a `font-size`. This is useful in many situations such as increasing the font size relatively across your whole website by changing one value or adding padding which is larger or smaller depending on the font size.
govspeak: An open source markup language
Originally published on the Met Office Informatics Lab blog on July 22nd, 2015.
The Informatics Lab website is created with an application called Jekyll. Recently I made an enhancement to it which I’m very excited about. It allows us to write our articles in a markup language called Govspeak, which is an extension to the excellent markdown.
Lab School: Docker
Originally published on the Met Office Informatics Lab blog on June 24th, 2015.
Welcome to the first ever Lab School session. This session aims to give you an overview of Docker and how we are currently using it in the Lab.
Updating flightradar24 with a Raspberry Pi
You can feed data into flightradar24 (from now on referred to as fr24) simply using a Raspberry Pi and a cheap USB DVB-T tuner.
Collaborative article corrections in Jekyll
Don’t you find it really useful when you publish an article on your blog and then someone comes up to you and says:
Simple reading speed estimate in Jekyll
When browsing through off-the-shelf Jekyll themes recently I stumbled across one I really like called Pixyll. There are lots of things I like about the theme but one thing in particular is the reading speed estimate at the top of each article. Not only is it a nice feature but the code is simple and concise too!
Test your Jekyll blog with Travis CI
Introduction
Testing your blog may sound like an odd thing to do, but if you’re running a Jekyll blog hosted on GitHub it is simple to set up and really useful for notifying you about broken links and other issues.
Bullet Journaling in 2015
Introduction
It is one month into 2015 and I am on page 31 of my latest bullet journal. I started using bullet journaling to organise myself in March last year and now I can’t imagine my life without it.
How to install and configure inadyn on CentOS 6
Introduction
Inadyn is a command line utility for periodically checking and updating your ip address with DynDNS.
How to install VMware Tools on Centos 6 with yum
Introduction
Often when managing a large number of systems you want to manage all software installs the same way. So when it comes to VMware Tools you may not want to follow the official instructions but instead install using yum, especially if you’re automating a large number of headless systems.
Twenty Fourteen Roundup
Introduction
I’ve decided to finish off the year by writing a roundup of all the technology I’ve used this year. Inspiration taken from sammcj’s “the best of”.
Run OpenVPN on non-standard port with SELinux and Centos 6
I recently installed OpenVPN on a CentOS 6 server but found that I couldn’t get the service to start. Running `service openvpn start` failed despite being able to run `openvpn --config /path/to/config` without errors.
How to easy_install and pip through a proxy
If you’re trying to install a Python package using easy_install or pip and you connect to the internet via a proxy you’ll need to make a few changes to your setup.
How to install the vSphere 5.5 Client on Windows 8
If you’ve tried installing the vSphere 5.5 client on Windows 8 you may have received the following error message
Simple HTML Redirect
I often find myself in need of a quick HTML redirect page. Most of the time I use the example from Stack Overflow but it involves changing the URL in 3 places.
How I value media and entertainment
I’ve been meaning to write about how I value media/entertainment and my thought process when purchasing it. This is different to my normal style of article but hopefully people will find it interesting.
How to use an Xbox 360 controller with OS X Yosemite
Update: There is a newer version of this article, see How to use an Xbox 360 controller with OS X El Capitan.
What is semalt and why are they in my analytics?
You may be among the many people having their Google Analytics stats skewed by fake referrals from semalt.
How to stop Google from scanning my site
Sometimes there may be occasions where you don’t want Google (and other search engines) to scan some or all of your website.
Amazon S3: s3cmd put ([Errno 32] Broken pipe)
Recently I decided to use Amazon’s S3 as another location to store some of my server backups. However I found when testing that I was unable to upload my backup tarballs to S3. I ended up with the following errors.
How to install OS X Yosemite Developer Preview Beta in Virtualbox
Jun 7, 2014 3 minute read #apple, #developer-preview, #guide, #os-x, #terminal, #virtualbox, #yosemite
Like me you may be excited about the Developer Preview Beta of OS X 10.10 Yosemite and want to try it out, but you don’t want to deal with a buggy system between now and the general release. If that’s the case you’ll want to install Yosemite as a virtual machine on your Mac. Here’s how I’ve done it on mine using VirtualBox.
How to use text expansion in OS X 10.9 Mavericks
Since the upgrade to OS X 10.9 Mavericks and iOS 7 you may have noticed that your text expansion shortcuts from iOS have found their way onto your Mac. Thanks to iCloud all of your text shortcuts are now synchronised between your devices.
How to make screen recordings in OS X Mavericks 10.9
Did you know that you don’t need any additional software to make high-quality, watermark-free screen recordings in OS X Mavericks 10.9? Well this feature exists and it’s in a slightly unexpected place … QuickTime Player.
NASA Space Apps Challenge 2014 Roundup
Apr 25, 2014 4 minute read #agile, #hackathon, #javascript, #management, #nasa-space-apps, #planning
A couple of weeks ago I attended a hackathon hosted by the Met Office as part of the 2014 NASA Space Apps Challenge. It was a weekend event and the basic idea was to meet with other technology professionals/enthusiasts, form teams with these like-minded folk, select a challenge from a long list of challenges set by NASA and solve it during the weekend.
How to prepare for a hackathon
So this weekend I’ll be taking part in the 2014 NASA Space Apps Challenge. This time around I’ll be leading a team, rather than just joining one on the day, so I feel like I have to be extra prepared. Here is a checklist of all the things you should do before going to a hackathon/codeathon.
Python script: Recursively remove empty folders/directories
So as part of a script I’m writing I needed the ability to recursively remove empty folders/directories from a filesystem. After a bit of googling I found this very useful script by Eneko Alonso. However the script isn’t really in a usable state for what I want so I decided to make a few changes to it and publish it on GitHub.
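The idea can be sketched roughly as follows. This is not the script from the post or Eneko Alonso's original, just a minimal illustration using the standard library; the function name `remove_empty_dirs` and the `remove_root` flag are made up for this example.

```python
import os


def remove_empty_dirs(root, remove_root=False):
    """Delete empty directories beneath root and return the paths removed.

    Hypothetical helper for illustration only, not the published script.
    """
    removed = []
    # topdown=False visits children before their parents, so a directory
    # that only contained empty directories is itself empty when we reach it.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root and not remove_root:
            continue
        # List the directory again instead of trusting the names os.walk
        # collected, because children may have been deleted since then.
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed
```

Walking bottom-up means a single pass is enough, and `os.rmdir` refuses to delete a non-empty directory, which gives a small safety net if files appear while the walk is running.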
Should I buy a cheap upgraded/reformatted SDHC micro SD card on eBay?
Short answer – No!
Now I must admit I am one for buying rubbish on eBay and usually fancy myself as someone who can spot the difference between a bargain and a scam. However this time I almost got scammed.
Convert tweet hashtags, at-tags and urls to links with PHP and Regular Expressions
Now of course if you’re using the Twitter API you can use Twitter entities but in this tutorial we’re going to use regular expressions.
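As a rough illustration of the regular expression approach, here is a sketch in Python rather than the PHP used in the post. The function name `linkify_tweet` and the link targets are assumptions for this example, and the input is assumed to be plain text that has already been HTML-escaped.

```python
import re

# One pattern with three alternatives, applied in a single pass so that a
# "#fragment" or "@" inside a URL is consumed by the URL branch first.
TOKEN = re.compile(
    r"(?P<url>https?://[^\s<]+)"   # URLs
    r"|(?<!\w)@(?P<user>\w+)"      # at-tags, but not the "@" in an email
    r"|(?<!\w)#(?P<tag>\w+)"       # hashtags
)


def linkify_tweet(text):
    """Wrap URLs, at-tags and hashtags in anchor tags (illustrative only)."""
    def replace(match):
        if match.group("url"):
            url = match.group("url")
            return f'<a href="{url}">{url}</a>'
        if match.group("user"):
            user = match.group("user")
            # Assumed profile URL scheme.
            return f'<a href="https://twitter.com/{user}">@{user}</a>'
        tag = match.group("tag")
        # Assumed hashtag URL scheme.
        return f'<a href="https://twitter.com/hashtag/{tag}">#{tag}</a>'

    return TOKEN.sub(replace, text)
```

Doing everything in one substitution avoids the common pitfall of a second pass rewriting the anchors produced by the first. The URL pattern is deliberately loose, so trailing punctuation such as a full stop would end up inside the link.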
Fixing VirtualBox verr_supdrv_component_not_found when selecting bridged networking on OS X 10.9
While installing CentOS in VirtualBox (version 4.2.4) on OS X (version 10.9.1) I came across the following error message when selecting bridged networking
Google Charts IE7 IE8 Issue: Date formatting problem
Just a quick post about an issue I’ve had with Google Charts on IE7/8.
When viewing my page in Firefox or Chrome my graph displayed as expected.
Mac OS X Terminal Theme: Piperita
UPDATE – This project is now at version 2. See the Piperita GitHub page for up to date documentation and information.
Bootcamp Windows 7 on a 2011 MacBook Pro without a SuperDrive
Sep 30, 2013 7 minute read #mac, #os-x, #paragon-ntfs, #refit, #ssd, #usb-superdrive, #vmware, #windows
So recently I swapped out the SuperDrive in my early 2011 MacBook Pro for an additional HDD caddy. I then moved my 1TB HDD into the caddy and put a new SSD in the HDD slot. As I was messing around with hard drives I decided to go for a fresh install of OS X and Windows on the SSD.
Fixing "ERROR: Error 35: error:14077458:SSL routines:SSL23_GET_SERVER_HELLO: reason(1112)"
So the other day I came across the following error when using the munki configuration tool for mac.
Sort top command by cpu usage and set to default in OS X
As I come from a Linux background but seem to spend more and more of my free time using OS X I keep noticing little differences in the way the command line works on a Mac. One difference which has been bugging me recently is the way that the top command orders itself. I’m used to having it ordered by highest processor usage at the top on Linux, which I find the most useful as generally when I run top I’m looking to see what is chewing up my CPU. However when you run top in OS X it orders by pid, so the newest processes are at the top.
Why is there no space in the MySQL password parameter?
After troubleshooting a MySQL issue with a colleague we began discussing a “feature” of the MySQL command line which insists that you don’t put a space in the password parameter when using the short parameter. We both felt that it was rather inconsistent to allow the usage of `-h hostname` or `-u username` but insist on `-ppassword` instead of `-p password`. You can of course use the full parameter `--password=password` but as most people use the shorthand commands it just seems slightly unintuitive.
Using the AddThis Share Buttons wordpress plugin in a custom theme
There is an undocumented function for adding a custom AddThis widget to your WordPress theme when using the AddThis Share Buttons plugin, so I thought I would document it here.
Convincing Paypal Phishing Email
So I woke up yesterday morning to find a receipt from PayPal for $149.49 on my iPhone. I haven’t bought anything for $149.49 so right away I was worried that my PayPal account had been broken into. The first thing I did was to log into my PayPal account on my laptop to see more information about the transaction. When logging in there was no trace of this transaction.
Has Stack Overflow been hijacked?
Something funny seems to be happening with Google and Stack Overflow.
I first noticed this last night when on my MacBook at home. I went to www.google.co.uk, typed in “stackoverflow” and was presented with the usual page. However I noticed that the URL displayed under the link on the search results said www.doioig.gov. When clicking the link it took me to www.doioig.gov instead of www.stackoverflow.com. I thought to myself that this was probably just a temporary issue and went to the correct URL myself.
SSH without a password on OS X with ssh-copy-id
What is ssh-copy-id?
ssh-copy-id is a script that uses ssh to log into a remote machine (presumably using a login password, so password authentication should be enabled, unless you’ve done some clever use of multiple identities). It also changes the permissions of the remote user’s home, `~/.ssh`, and `~/.ssh/authorized_keys` to remove group writability (which would otherwise prevent you from logging in, if the remote `sshd` has `StrictModes` set in its configuration).
Album Review: Muse 2nd Law
Well I just got “2nd Law” by Muse and as I’m a big Muse fan I thought I’d write a quick review of it. Up until now I’ve enjoyed all of Muse’s albums with the exception of album 5 “The Resistance” which I felt was a little too experimental for my liking.
Using background-position and sprite sheets to stop icon hover flicker
While updating the theme on this blog I added some links to my social websites. I made these links in the form of images which were black and white and became coloured when hovered over. To create these icons I created empty divs which would then be styled in CSS.
Incorrect Gmail password when using exchange on iPhone
The Problem
Recently I’ve tried playing around a little with the email account settings on my iPhone to try and get as much to sync as possible with some of my other accounts. One thing I did was to change Gmail from the standard ‘Gmail’ setting that you get on iOS to the ‘Microsoft Exchange’ setting as recommended by Google.
How to query a database with AJAX and display as a tooltip
This post began as an answer on Stack Overflow to a question on “How to query a database with AJAX and display as a tooltip”. I have put the answer here for future reference.
Download Festival 2012 Timetable
Download have now released the times for Download Festival 2012 in the form of an app. The only problem with this is it’s not exactly clear and easy to see who is clashing or to make a plan. For my own benefit and anyone else who wants it I’ve put the timetable into a spreadsheet and put it in Google Docs.
Shrinking SQL logs
While reading some articles about minimizing the size of data while leaving the data fully searchable I came across a lot of info about finding common data and replacing it with pointers to a dictionary or store of common strings. After thinking about it for a while I thought I would just write a little post about a way I try and minimize the size of databases in projects that I work on. Now I’m going to start with an example of a very bad database I wrote a while back and how I used relational tables to shrink the size considerably. Now if you know anything about databases you should find all of this quite obvious but if you don’t then it may be worth taking note of.
The lesser known browser war
Feb 7, 2012 3 minute read #browser-war, #browsers, #chome, #firefox, #internet-explorer, #opera, #safari
Well it’s been quite a while since I’ve written a post on here but due to a suggestion from a friend I’m going to try and make an effort to write more about the projects I work on and the random thoughts that fall out of my brain. A few things have changed since my last post but the main thing would be my job. I am now a software test analyst, aka that guy who has to pick fault with your hard work. I’m enjoying this position and it is actually quite nice finding bugs and knowing you’re not going to have to be the one that fixes them.
Apple game center hacks
Ok I’m currently on holiday with my girlfriend in Turkey. It started getting a bit too hot in the sun a few days ago so we went into the hotel lobby and found free wifi. We went to our room and both got out our iOS devices and went to the lobby to catch up with Facebook etc.
Additional HDD Philosophy
A friend has asked me what I think the best way for him to set up his new HDD is. He currently has a 500GB Internal HDD and has purchased a second one. So here is what I would do.
Guide to repairing TFT monitor scratches
About a year and a half ago I was working on a guitar modification and dropped part of the modified casing on my laptop. It made a pretty good scratch on the bottom of my monitor about an inch long (2.54cm for you metrics out there). It has mildly bothered me since then but I’ve never done anything about it. But this evening I decided to google around and see if there was any kind of specialist product that could be purchased and used to fill in the scratch. I read a few blogs and sites until I found the most amazing solution ever.
Ubuntu setup 2010
I am now happy with my Ubuntu installation, after a small hiccup earlier where it decided not to do an update properly, disabled my mouse and keyboard and then asked for a password for them to be re-enabled. Solved with a trusty USB mouse and onscreen keyboard. I have managed to find a PHP/HTML/JS editor to my liking. I have looked through a few more since my post last night and have settled on Komodo. It highlights my code properly, auto completes functions and even highlights closing tags when you select the opening one which is going to make my life with tables and divs a hell of a lot easier. Definitely a viable replacement for Dreamweaver as the only thing I missed was the colour scheme which has easily been remedied in the settings.
Back to Ubuntu
For as long as I’ve been interested in computers I’ve been interested in Linux. I was introduced to it by a friend at school when I was 13, when we installed Mandrake in VMware. We only admired it for a few minutes and then deleted it. He later introduced me to Ubuntu and he began using it as his only operating system and still does to this day. I personally couldn’t move from Windows mainly due to gaming but also because I can’t stand to lose Photoshop, Dreamweaver etc. I also have a problem with Wine, VMware and VirtualBox: why emulate Windows when you can just run it?