#python
Tag

Python version epochs are broken

May 1, 2024 5 minute read #python, #versioning

In PEP 440 Python introduced Version Epochs as a mechanism to allow projects to change their versioning scheme. Unfortunately, I can’t see how a project could actually make use of this without confusing its users.
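As a rough illustration of what an epoch does, here is a minimal sketch of how an epoch-prefixed version sorts against a date-based one. The `version_key` helper is hypothetical and deliberately simplified: it only understands an optional `N!` epoch prefix followed by a dotted numeric release, and is not a full PEP 440 parser.

```python
def version_key(version: str) -> tuple[int, tuple[int, ...]]:
    """Simplified sort key: optional ``N!`` epoch plus a dotted numeric release.

    Hypothetical helper for illustration only. Pre-releases, post-releases,
    dev releases and local versions are not handled.
    """
    epoch, sep, release = version.rpartition("!")
    # A version without an explicit epoch is treated as epoch 0.
    epoch_number = int(epoch) if sep else 0
    return epoch_number, tuple(int(part) for part in release.split("."))


# A project moving from date-based versions to 1.x needs the epoch so that
# the new, numerically smaller versions still sort as newer.
versions = ["2024.1.0", "1!1.0.0", "2023.12.1", "1!1.1.0"]
ordered = sorted(versions, key=version_key)
print(ordered)
```

The epoch is compared before anything else, which is why `1!1.0.0` sorts after `2024.1.0` here even though its release number is smaller.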
A beginner's guide to managing Kubernetes resources in Python with kr8s

Mar 11, 2024 5 minute read #python, #kubernetes, #tutorial, #kr8s

Managing Kubernetes resources with Python has never been easier thanks to the kr8s Kubernetes client for Python.
Running Dask on Databricks

Feb 15, 2024 4 minute read #python, #dask, #databricks, #distributed-computing, #cloud-deployment

Databricks is a very popular data analytics platform used by data scientists, engineers, and businesses around the world. It was founded by the creators of Apache Spark, a powerful open-source data processing engine, and builds on top of Spark to provide a comprehensive analytics platform.
Running Dask workloads on multiple cluster backends with zero code changes using dask-ctl

Jan 25, 2024 3 minute read #python, #dask, #dask-ctl, #distributed-computing

Sometimes you want to write some code using Dask which can then be run against multiple different cluster backends. For example, for local testing you might want to use LocalCluster, but in production use KubeCluster. Or perhaps you want to easily switch between an on-premises HPC with SLURMRunner or the cloud with Coiled.
EffVer: Version your code by the effort required to upgrade

Jan 15, 2024 11 minute read #python, #versioning

Version numbers are hard to get right. Semantic Versioning (SemVer) communicates backward compatibility via version numbers, which often leads to a false sense of security and broken promises. Calendar Versioning (CalVer) sits at the other extreme of communicating almost no useful information at all.
How to get typer to show help by default

Jan 10, 2024 2 minute read #typer, #python, #cli, #quick-tips

I love using typer for creating CLI tools in Python. It makes creating complex trees of subcommands really straightforward.
Comparison of kr8s vs other Python libraries for Kubernetes

Sep 4, 2023 11 minute read #kubernetes, #python, #kr8s, #lightkube, #pykube-ng, #kubernetes-asyncio

I’ve been working on kr8s for a while now and one of my core goals is to build a Python library for Kubernetes that is as simple and readable as possible and produces the most maintainable code. It should enable folks to write dumb code when working with Kubernetes.
Livestream notes: Replacing aiohttp with httpx in kr8s

Jul 4, 2023 1 minute read #coding, #python, #kr8s, #httpx, #asyncio, #trio, #aiohttp, #kubernetes

This post will be updated with notes from the livestream throughout the day.

Today I will be streaming some open source code refactoring. Come and join in on Twitch! Don’t forget to say hi in the chat 😊.
Introducing kr8s, a new Kubernetes client library for Python inspired by kubectl

Jun 19, 2023 7 minute read #project, #kubernetes, #python, #kr8s, #kubectl-ng

For the last few months I’ve been tinkering with a new Kubernetes client library for Python called kr8s.
Debugging Data Science workflows at scale

May 12, 2023 15 minute read #python, #dask, #kubernetes, #apache-beam, #google-cloud, #google-kubernetes-engine

The more we scale up our workloads the more we run into bugs that only appear at scale. Reproducing these bugs can be expensive, time consuming and error prone. In order to report a bug on a GitHub repo you generally need to isolate the bug and come up with a minimal reproducer so that the maintainer can investigate. But what if a minimal reproducer requires hundreds of servers to isolate and replicate?
Sometimes I regret using CalVer

Jan 16, 2023 12 minute read #software-development, #open-source, #python, #semver, #calver

Over the last few years, many open-source Python projects that I work on have switched to CalVer. I’ve felt some pain around this, particularly in Dask and its subprojects. I want to unpack some of my thoughts and feelings around this trend.
Using Dask on KubeFlow with the Dask Kubernetes Operator

Jul 27, 2022 10 minute read #python, #dask, #kubernetes, #kubeflow, #operator, #data-science

Kubeflow is a popular Machine Learning and MLOps platform built on Kubernetes for designing and running Machine Learning pipelines for training models and providing inference services. It has a notebook service that lets you launch interactive Jupyter servers (and more) on your Kubernetes cluster as well as a pipeline service with a DSL library written in Python for designing and building repeatable workflows. It also has tools for hyperparameter tuning and running model inference servers, everything you need to build a robust ML service.
How to set environment variables on your Dask workers

May 5, 2022 2 minute read #python, #dask, #snippet

When working with Dask clusters you often need the remote worker environment to match your local environment. This generally means having the same packages and data available.
Branding your open source Python package

Jul 9, 2021 12 minute read #python, #github, #tutorial

Having a brand can help give your open source project some legitimacy, and you don’t need to be a designer to see these benefits. However it is important to understand that you do not need to add branding to your project in order for it to be successful, and adding branding can even harm your project.
The evolution of a Dask Distributed user

Jun 1, 2021 9 minute read #dask, #python, #distributed, #user-journey Archive

This week was the 2021 Dask Summit and one of the workshops that we ran covered many deployment options for Dask Distributed.
Building a contributor community for your open source project

Apr 30, 2021 10 minute read #python, #github, #tutorial

With our open source project published on GitHub we probably want to allow folks to contribute changes. Some users of the project may find bugs, or desire extra features and will open issues to tell you. Users who have the skills required to make that change can open a Pull Request on GitHub to propose it. As the maintainer you can then review and merge those changes.
Communicating with your open source community

Apr 23, 2021 7 minute read #python, #github, #tutorial

Once your open source Python project has users and a community you will likely want to communicate with them in an official capacity. Perhaps you want to tell them about a new release, show a use case where someone is using your tool or solicit feedback on an upcoming feature.
Building a user community for your open source project

Apr 16, 2021 11 minute read #python, #github, #tutorial

Now that our open source Python project exists and users can install it we will want to turn our attention to sustainability, reach and ongoing maintenance. By putting it out there and gaining users you are opening yourself up to questions, bug reports and feature requests.
Documenting Python projects with Sphinx and Read the Docs

Apr 9, 2021 7 minute read #python, #github, #tutorial

In part four of this series we discussed documenting our code as we went along by adding docstrings throughout our project. In this post we will see that effort pay off by building a documentation site using Sphinx which will leverage all of our existing docstrings.
Automating releases of Python packages with GitHub Actions

Mar 26, 2021 7 minute read #python, #github, #tutorial

In this post we will cover automatically packaging and releasing our project when a new git tag is pushed to GitHub.
Testing and Continuous Integration for Python packages with GitHub Actions

Mar 19, 2021 10 minute read #python, #github, #tutorial

In this post we will cover automatically running our tests when we push new code to GitHub, and when contributors raise Pull Requests against our project.
Awaitable Objects and Async Context Managers in Python

Mar 17, 2021 3 minute read #python, #asyncio, #tutorial

Python objects are synchronous by default. When working with asyncio, if we create an object, its __init__ is a regular function, so we cannot do any async work in it.
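One common way around this is to make the object itself awaitable by defining `__await__`. The following is a minimal sketch of that idea, not necessarily the approach the full post takes. The `Connection` class and its `_setup` method are hypothetical names used only for illustration.

```python
import asyncio


class Connection:
    """Hypothetical resource whose setup needs async work."""

    def __init__(self, name):
        # __init__ is a regular function, so only synchronous setup goes here.
        self.name = name
        self.ready = False

    async def _setup(self):
        # Stand-in for real async work such as opening a socket.
        await asyncio.sleep(0)
        self.ready = True
        return self

    def __await__(self):
        # Delegate to the coroutine so `await Connection(...)` runs the
        # async setup and evaluates to the initialised object.
        return self._setup().__await__()


async def main():
    conn = await Connection("demo")
    return conn


conn = asyncio.run(main())
print(conn.name, conn.ready)
```

With this pattern the caller writes `await Connection("demo")` and gets back an object whose async setup has already completed.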
Test driven development in Python

Mar 12, 2021 7 minute read #python, #github, #tutorial

What is test driven development (TDD)?

Test driven development is a style of development where you write your tests before you write your code.
Testing your Python package

Jan 22, 2021 7 minute read #python, #github, #tutorial

In this post we will cover testing our code.

Testing

There are many, many great resources out there for learning about testing software. In this post I’m going to try to focus on simple examples that you can use to get started quickly. Once you have a good foundation for your tests you can then dive into mocking, replaying HTTP requests or even hypothesis testing.
Documenting your Python code

Jan 15, 2021 4 minute read #python, #github, #tutorial

This post will cover documenting our code, specifically adding documentation within the code itself.

Docstrings

Right now our code is undocumented, so if the user inspects our function they will only see the interface (the way you call it) but with no other context. We can use IPython to quickly inspect this.
Running Dask tutorials

Aug 21, 2020 20 minute read #python, #dask, #distributed-computing, #open-source, #community, #tutorials Archive

Originally published on the Dask blog on August 21st, 2020.

For the last couple of months we’ve been running community tutorials every three weeks or so. The response from the community has been great and we’ve had 50-100 people at each 90 minute session.
The current state of distributed Dask clusters

Jul 23, 2020 19 minute read #python, #dask, #distributed-computing Archive

Originally published on the Dask blog on July 23rd, 2020.

Dask enables you to build up a graph of the computation you want to perform and then executes it in parallel for you. This is great for making best use of your computer’s hardware. It is also great when you want to expand beyond the limits of a single machine.
Publishing open source Python packages on GitHub, PyPI and Conda Forge

Feb 28, 2020 10 minute read #python, #github, #tutorial, #packaging, #pypi, #conda-forge, #anaconda

In this post we will cover making our code available to people. This is the bit where we open the source! We will push our code to a code hosting platform and then package up our library and submit it to a couple of repositories to make it easy for people to install.
Versioning and formatting your Python code

Feb 14, 2020 8 minute read #python, #github, #tutorial, #black, #versioneer, #semver

In this post, we will cover a few project hygiene things that we may want to put into place to make our lives easier in the future.
Testing static sites with Lighthouse CI and GitHub Actions

Feb 13, 2020 7 minute read #python, #github, #tutorial, #github-actions, #static-sites, #lighthouse-ci

When you build a website you want pages to load as quickly as possible for users. Google has a tool called PageSpeed Insights which you can run on your website to see various metrics about the page. I’ve used it in the past while working on my blog and other sites.
Creating an open source Python project from scratch

Feb 7, 2020 9 minute read #python, #github, #tutorial, #git, #oss-licensing

Have you had a great idea for an open-source Python library that you think people will find useful, but you don’t know where to begin in creating and publishing it?
Creating GitHub Actions in Python

Dec 9, 2019 14 minute read #python, #github-actions, #tutorial

Note: This post is also available in Go flavour.

GitHub Actions provide a way to automate your software development workflows on GitHub. This includes traditional CI/CD tasks on all three major operating systems such as running test suites, building applications and publishing packages. But it also includes automated greetings for new contributors, labelling pull requests based on the files changed, or even creating cron jobs to perform scheduled tasks.
Cleaning up conda environments

Aug 23, 2019 1 minute read #conda, #python, #hygine

Often when I’m developing or debugging in Python I end up creating throw away conda environments. They will be to test some package installation or combination of packages and once I’ve finished I will probably never use them again.
ChatOps - Automation via chat

Dec 19, 2017 8 minute read #worksops, #chatops, #opsdroid, #python Archive

Originally published on the Met Office Informatics Lab blog on December 19th, 2017.

ChatOps - Automation via chat

This article is a companion to a workshop on using chat to automate ops workflows. This is a static version of a Jupyter Notebook which you can download here.
Getting started with VMware's ESXi/vSphere API in Python

Jun 22, 2016 2 minute read #vmware, #python, #api

In 2013 VMware dropped their Python library for accessing the API for ESXi/vSphere on GitHub. This is great, however it isn’t the easiest library in the world to use. This quick guide will show you how to connect to an ESXi host or vSphere cluster and get some info about a virtual machine.
How to easy_install and pip through a proxy

Nov 25, 2014 1 minute read #python, #easy_install, #pip, #proxy

If you’re trying to install a Python package using easy_install or pip and you connect to the internet via a proxy you’ll need to make a few changes to your setup.
Python script: Recursively remove empty folders/directories

Feb 16, 2014 2 minute read #module, #python, #script

So as part of a script I’m writing I needed the ability to recursively remove empty folders/directories from a filesystem. After a bit of googling I found this very useful script by Eneko Alonso. However the script isn’t really in a usable state for what I want so I decided to make a few changes to it and publish it on GitHub.
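For a sense of what such a script involves, here is a minimal sketch of one way to remove empty directories recursively with the standard library. This is not the script referenced above. The `remove_empty_dirs` function is a hypothetical helper written for illustration, and it assumes the root directory itself should be kept.

```python
import os
import tempfile


def remove_empty_dirs(root: str) -> list[str]:
    """Remove every empty directory below ``root`` and return their paths.

    Hypothetical helper for illustration. ``root`` itself is never removed.
    """
    removed = []
    # Walk bottom-up so children are visited before their parents, which
    # lets a parent become empty once its empty children are gone.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue
        # Re-check the directory on disk because os.walk's listing was
        # taken before any children were removed.
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed


# Small demonstration in a throwaway temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b", "c"))  # nested, all empty
    os.makedirs(os.path.join(tmp, "keep"))
    with open(os.path.join(tmp, "keep", "file.txt"), "w") as f:
        f.write("data")

    removed = remove_empty_dirs(tmp)
    remaining = sorted(os.listdir(tmp))

print(remaining)
```

Walking bottom-up is the key design choice: the nested `a/b/c` chain disappears in a single pass, while `keep` survives because it still contains a file.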
