Removing an S3 bucket

  • Post Category:IT

I tried to remove a bucket from an S3 compatible object store. The trouble was that it wasn't empty: there were lots of objects inside. I tried minio's client:

    mc rm --recursive --dangerous --force exo/somebucket

But it just seemed to stall after removing ~100 objects. The solution is to run it using timeout in a loop, e.g.

    while true; do timeout 10 mc rm --recursive --dangerous --force exo/somebucket; done

The loop never exits on its own, so interrupt it with Ctrl+C once the bucket is empty.
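
A bounded variant of the loop above could look like the sketch below. The helper name retry_with_timeout is made up for this example, and it assumes that mc exits with status 0 once it manages to finish without being killed by timeout; I have not verified that for every mc version.

```shell
#!/bin/sh
# retry_with_timeout SECONDS MAX_ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: rerun COMMAND under timeout(1) until it exits 0
# or MAX_ATTEMPTS runs have been used up.
retry_with_timeout() {
    rwt_secs=$1
    rwt_max=$2
    shift 2
    rwt_attempt=1
    while [ "$rwt_attempt" -le "$rwt_max" ]; do
        # timeout exits with 124 when it had to kill the command
        if timeout "$rwt_secs" "$@"; then
            return 0
        fi
        rwt_attempt=$((rwt_attempt + 1))
    done
    # Every attempt failed or timed out
    return 1
}
```

It would be called as: retry_with_timeout 10 100 mc rm --recursive --dangerous --force exo/somebucket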


How to communicate with the Ops team

  • Post Category:IT

A good article from Rackforest titled Don’t Throw Your Code Over The Wall: 5 Ways To Work With Ops Engineers. I had meant to send it out earlier, but I only just noticed that the post was still sitting in draft. This old translation should not go to waste.

1. Provide a detailed description
Don't just say that you need, for example, mysql. State the desired/preferred version, whether you need replication (and if so, what kind), etc. Also say how many resources your application needs (disk, cpu, memory); in short, everything the Ops team needs to understand the project.

2. Usable logging
The log entries must tell the Ops team what the problem is and where to start looking.

3. Have a rollback plan
If the upgrade brings a surprise, you must be able to revert to the previous, still working version.

4. A clearly communicated SLA is required
Already when designing the HW environment you need to know how many nines of availability are required. An expected uptime of 99.999% allows about 5 minutes of downtime per year. You also need to explain clearly whether the downtime caused by, say, a reboot results in a measurable loss of revenue, or whether only a few users will grumble a bit.
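
The five nines figure in point 4 is simple arithmetic: the allowed downtime is (100 - availability) percent of the year. A small sketch; the function name is made up for this example, and it assumes a 365-day year.

```shell
#!/bin/sh
# downtime_minutes_per_year AVAILABILITY_PERCENT
# Hypothetical helper: print the yearly downtime budget in minutes,
# assuming a 365-day year.
downtime_minutes_per_year() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale
    LC_ALL=C awk -v a="$1" \
        'BEGIN { printf "%.2f\n", (100 - a) / 100 * 365 * 24 * 60 }'
}
```

downtime_minutes_per_year 99.999 prints 5.26, that is, a little over five minutes a year, while 99.9 already allows 525.60 minutes.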


Using cAdvisor to get a peek into Docker

  • Post Category:IT

You may use the Docker Engine API to get some basic information from docker. Try the following to get a json output:

    curl -s --unix-socket /var/run/docker.sock http://localhost/containers/json | python -m json.tool

To get data for a single container:

    curl -s --unix-socket /var/run/docker.sock http://localhost/containers/c101546a3690/json | python -m json.tool

Pros: simple
Cons: no aggregation, no visualization.

To take it to the next level, give cAdvisor a shot: it taps the Docker API, and gives you visual and historical data about what's going on inside Docker. cAdvisor runs as a docker container:

    docker run \
      --volume=/:/rootfs:ro \
      --volume=/var/run:/var/run:rw \
      --volume=/sys:/sys:ro \
      --volume=/var/lib/docker/:/var/lib/docker:ro \
      --publish=8080:8080 \
      --detach=true \
      --name=cadvisor \
      google/cadvisor:latest

Then simply visit http://127.0.0.1:8080

Pros: visual
Cons: limited timeframe, limited metrics

You may also want to store this data in a time series database, e.g. InfluxDB or similar, and visualize it with Grafana for even better visualization and a longer history. Or you may give sysdig a chance, as it provides much more. See https://dzone.com/storage/assets/9981079-dzone-refcard236-dockermonitoring.pdf for more on the topic.
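
The two curl calls above list and inspect containers; live resource figures come from the stats endpoint of the same API. A minimal sketch, assuming the default socket path used above; the function name docker_stats is made up for this example.

```shell
#!/bin/sh
# docker_stats CONTAINER
# Hypothetical helper: fetch a single stats sample for one container
# from the Docker Engine API over the local unix socket.
docker_stats() {
    # stream=false asks for one JSON document instead of a continuous stream
    curl -s --unix-socket /var/run/docker.sock \
        "http://localhost/containers/$1/stats?stream=false"
}
```

The output can be piped through the same formatter as before: docker_stats c101546a3690 | python -m json.tool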


Query docker container stuff

  • Post Category:IT

Add the following to Dockerfile:

    HEALTHCHECK --interval=10s --timeout=3s CMD curl -s smtp://localhost/ || exit 1

Then query the health and the state of the container:

    $ docker inspect --format "{{.State.Health.Status}}" 0dbed54e70cc
    healthy
    $ docker inspect --format "{{.State.Status}}" 0dbed54e70cc
    running
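
The health status is handy in scripts that have to wait for a container to come up. A minimal sketch; the function name wait_healthy and the one second polling interval are choices made for this example only.

```shell
#!/bin/sh
# wait_healthy CONTAINER MAX_TRIES
# Hypothetical helper: poll the health status once per second until the
# container reports "healthy" or MAX_TRIES polls have been made.
wait_healthy() {
    wh_try=1
    while [ "$wh_try" -le "$2" ]; do
        # Fall back to "unknown" if docker inspect itself fails
        wh_status=$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null) \
            || wh_status=unknown
        if [ "$wh_status" = "healthy" ]; then
            return 0
        fi
        wh_try=$((wh_try + 1))
        sleep 1
    done
    return 1
}
```

For example: wait_healthy 0dbed54e70cc 30 && echo "container is ready"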


Notebook, external monitor, xfce

  • Post Category:IT

I had an odd issue with Xubuntu (using xfce4). After logging in, the external screen went blank until I opened the notebook screen. Sad. The fix is to disable the Xfce4 power manager service.


Retire your old stuff with retire.js

  • Post Category:IT

I stumbled upon an article on dzone, 5 Quick Wins for Securing Continuous Delivery, which mentions retire.js, a javascript tool that scans a given webpage for libraries with known security vulnerabilities. Note that it has an addon for Firefox and Chrome as well. So I installed the firefox addon, and it was worth it, because I learned the terrible truth: jquery versions 1.x and 2.x have some unfixed issues. So I updated the piler enterprise configs to use the most recent versions of jquery and other js libs from CDNs, and now the addon is happy with the piler enterprise GUI. This is controlled by a new config option called JS_CODE in config.php, so you can point it to local copies of the js libraries if your users are on a network without access to the Internet.


Fixing a corrupt InnoDB database

  • Post Category:IT

Recently I was asked to help fix a mysql server issue. The mysql server couldn't start on a somewhat older Ubuntu (Trusty). I checked /var/log/mysql/error.log, and it said something like it might be a mysql bug, or that the mysql binaries (or libraries?) might not be built for this platform. WTF? Finally the customer explained that there had been a power outage, and even the UPS had failed, resulting in a corrupted database. So the innodb database was in pretty bad shape. OK, let's try to bring it back to life by healing it:

    mysqld --user=mysql --datadir=/var/lib/mysql --innodb-force-recovery=1

No cigar. I tried values up to 4. According to the official mysql docs, values of 3 or less are relatively safe, while values of 4 or greater may permanently corrupt data files. Still no luck. Because at this point I had nothing (more) to lose (there was no backup of the database, and the customer couldn't start mysqld), I took a deep breath, told the customer to prepare for the worst (i.e. data loss), and tried --innodb-force-recovery=5, and then --innodb-force-recovery=6. The last attempt was successful in the sense that at least mysqld started, but it was logging the following message in…
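
Since the higher recovery levels can damage the data files further, it may be worth copying the raw data directory aside before forcing recovery, even when it is already corrupt. A minimal sketch, assuming mysqld is stopped and there is enough free disk space; the function name and the .bak-<timestamp> suffix are made up for this example.

```shell
#!/bin/sh
# backup_datadir DATADIR
# Hypothetical helper: copy DATADIR to DATADIR.bak-<timestamp> and print
# the name of the copy. cp -a preserves permissions and timestamps (and
# ownership when run as root).
backup_datadir() {
    bd_src=${1%/}    # drop a single trailing slash, if any
    bd_dest="$bd_src.bak-$(date +%Y%m%d%H%M%S)"
    cp -a "$bd_src" "$bd_dest" && printf '%s\n' "$bd_dest"
}
```

For example: backup_datadir /var/lib/mysql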


slapd high memory usage in docker

  • Post Category:IT

I installed slapd in Docker, and it was using 712 MB of memory even with only a few entries. The fix is to run slapd after ulimit -n 1024, e.g.

    #!/bin/bash
    ulimit -n 1024
    slapd -d3

Starting slapd with such a wrapper script has improved the situation considerably:

    $ docker stats --no-stream --format \
        "table {{.Container}}\t{{.MemUsage}}" slapd
    CONTAINER           MEM USAGE / LIMIT
    slapd               3.855MiB / 1.944GiB
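
The wrapper works by lowering the open file descriptor limit before slapd starts. The same idea can be written as a generic helper; this is only a sketch, and the name run_with_nofile is made up for this example.

```shell
#!/bin/sh
# run_with_nofile LIMIT COMMAND [ARGS...]
# Hypothetical helper: run COMMAND with the open file descriptor limit
# set to LIMIT, without changing the limit of the calling shell.
run_with_nofile() {
    rwn_limit=$1
    shift
    (
        # The subshell keeps the lowered limit local to this command
        ulimit -n "$rwn_limit" || exit 1
        exec "$@"
    )
}
```

For example: run_with_nofile 1024 slapd -d3. Alternatively, the limit can be set from outside the container with docker run --ulimit nofile=1024:1024, which needs no wrapper script.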


Application performance monitoring (APM)

  • Post Category:IT

I just read a blog series (App in a box) from Peter Hack at https://www.dynatrace.com/news/blog/app-in-a-box/, https://www.dynatrace.com/news/blog/app-in-a-box-customer-perspective/ and https://www.dynatrace.com/news/blog/app-in-a-box-part-3-logs/. Infrastructure monitoring (HW, OS, processes, network) is important, but not enough, because it can't tell you about the health of the application, nor about your customers' perspective of it. Health checks may tell you whether your application is available or not. However, such tests should be run from a certain "distance", as close to your users as possible. A health check may pass when it is run from the next host in the same data center. But what if your host becomes unavailable from the Internet, because the network access of the datacenter is down? Then your green health check results won't help the users. So synthetic tests are best performed from another location, another datacenter, etc. Also note that uptime is not the same as availability. Real-User Monitoring (RUM) helps you understand the behavior of your users better. Using some monitoring tools you may follow your users' journeys on your site to detect behavioral bottlenecks in your applications, and even the need for design optimizations. Developers may identify and fix page load problems and performance bottlenecks in the browser. However, resource usage, customer experience, and availability only…
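
A synthetic check in its simplest form is an HTTP request made from a host outside your own datacenter. A minimal sketch; the function name probe, the 5 second limit and the example URL are assumptions made for this example, not something taken from the blog series.

```shell
#!/bin/sh
# probe URL
# Hypothetical helper: print "up" if URL answers with a successful HTTP
# status within 5 seconds, "down" otherwise.
probe() {
    # -f turns HTTP errors (status >= 400) into a non-zero exit status
    if curl -fsS -o /dev/null --max-time 5 "$1" 2>/dev/null; then
        echo up
    else
        echo down
    fi
}
```

Run from cron on a host in another location, for example as probe https://example.com/health, this gives a view that is closer to what your users see than a check made from the next host in the rack.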
