Category Archives: Uncategorized

Exploring Telnet: The Retro Tech Still Offering Fun Surprises

In the fast-paced world of modern computing, where sleek interfaces and seamless connectivity reign supreme, it’s easy to forget about the old tools that paved the way for today’s digital marvels. One such tool is Telnet. Though it may seem antiquated now, Telnet has a storied history and even today, offers some unexpectedly fun uses that you can enjoy right from your keyboard.

See the related video on my YouTube channel.

What is Telnet?

Telnet, short for “TELetype NETwork,” is one of the earliest protocols used for accessing remote computers over the internet or a local network. Telnet allows users to connect to remote servers and interact with them as if they were local, using a text-based interface. Before graphical user interfaces (GUIs) became the norm, Telnet was a fundamental tool for system administrators, developers, and anyone needing remote access to a computer.

Telnet operates on the client-server model. A Telnet client connects to a Telnet server via the command line or a terminal emulator, and once connected, users can execute commands on the remote machine. It was a revolutionary tool in its time, but it lacks the security features of more modern protocols like SSH (Secure Shell). As a result, Telnet has largely fallen out of favour for secure communications but remains a fascinating relic of the early internet.
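
To make the client-server exchange concrete, here is a minimal sketch in Python that opens a raw TCP connection to a Telnet server and streams whatever it sends back (assuming the server is still up). A real Telnet client would also answer IAC option-negotiation sequences; this sketch simply ignores them:

import socket

# Minimal sketch: connect to a Telnet server and print whatever it sends.
# A full client would also handle IAC option negotiation; we skip that here.
HOST, PORT = "towel.blinkenlights.nl", 23

with socket.create_connection((HOST, PORT)) as s:
    try:
        while True:
            data = s.recv(4096)
            if not data:          # server closed the connection
                break
            print(data.decode("ascii", errors="replace"), end="")
    except KeyboardInterrupt:
        pass                      # Ctrl+C to stop watching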

Two Fun Uses of Telnet

Despite its outdated nature, Telnet can still provide a surprising amount of entertainment. Here are two fun and nostalgic uses of Telnet that you can try out:

1. Watch Star Wars in ASCII Art

One of the most delightful Easter eggs hidden on the internet is the ability to watch “Star Wars: Episode IV – A New Hope” rendered entirely in ASCII art via Telnet. This project, created by Simon Jansen, captures the magic of the iconic film using nothing but characters from the ASCII table.

How to Watch:

  1. Open your terminal or command prompt.
  2. Type the following command and press Enter: telnet towel.blinkenlights.nl

You will be greeted with a surprisingly detailed rendition of the Star Wars universe, complete with scrolling text and iconic scenes—all crafted with ASCII characters. It’s a testament to the creativity of early internet enthusiasts and a fun way to revisit a classic film.

2. Relive the Max Headroom Phenomenon

Max Headroom, the iconic 1980s character known for his glitchy, computer-generated appearance and stuttering speech, became a symbol of futuristic tech and cyberpunk aesthetics. While Max Headroom’s origins lie in TV, movies, and commercials, you can experience a bit of this retro-futuristic character through Telnet.

How to Connect:

  1. Open your terminal or command prompt.
  2. Type the following command and press Enter: telnet 1984.ws

You’ll be greeted with a Max Headroom emulation that pays homage to the quirky and groundbreaking character. It’s a fun way to dive into the retro-futuristic world that captivated audiences in the 80s.

How to Exit Telnet

While exploring Telnet is fun, knowing how to exit the session is equally important. Exiting Telnet sessions can vary slightly depending on the client and the server configuration, but here are the general steps:

  1. Use the escape sequence:
    • Typically, you can use the escape sequence Ctrl+] (hold Ctrl and press ]). This should bring you to the Telnet command prompt (telnet>).
  2. Close the connection:
    • Once at the Telnet command prompt, type quit and press Enter. This should close the Telnet session and return you to your original command prompt.
  3. Alternative method:
    • If the above methods don’t work, simply closing the terminal or command prompt window will also terminate the Telnet session.
  4. Reset a garbled terminal:
    • If your console behaves strangely after a Telnet session, run reset to restore the terminal settings.

Conclusion

Telnet may no longer be the go-to tool for remote computing, but its legacy lives on in unexpected ways. Whether you’re an old-school tech enthusiast or just looking for a bit of nostalgic fun, exploring Telnet can be a rewarding experience. From watching Star Wars in ASCII art to reliving the Max Headroom phenomenon, these hidden gems highlight the enduring creativity and innovation of early internet culture. So, fire up your terminal, connect to a Telnet server, and take a step back in time—you might just be surprised by what you find. And when you’re ready to log off, just remember those simple steps to exit. Happy exploring!

Exploring Steganography with Hidden Unicode Characters

In the digital age, where information security is paramount, steganography has emerged as a fascinating and subtle method for concealing information. Unlike traditional encryption, which transforms data into a seemingly random string, steganography hides information in plain sight. One intriguing technique is the use of hidden Unicode characters in plain text, an approach that combines simplicity with stealth.

Related video from my YouTube channel:

What is Steganography?

Steganography, derived from the Greek words “steganos” (hidden) and “graphein” (to write), is the practice of concealing messages or information within other non-suspicious messages or media. The goal is not to make the hidden information undecipherable but to ensure that it goes unnoticed. Historically, this could mean writing a message in invisible ink between the lines of an innocent letter. In the digital realm, it can involve embedding data in images, audio files, or text.

The Role of Unicode in Text Steganography

Unicode is a universal character encoding standard that allows text from various writing systems to be represented. It covers a vast range of characters: letters, numbers, symbols, and control characters. Some of these characters are non-printing or invisible, making them perfect for hiding information within plain text without altering its visible appearance.

How Does Unicode Steganography Work?

Unicode steganography leverages the non-printing characters within the Unicode standard to embed hidden messages in plain text. These characters can be inserted into the text without affecting its readability or format. Here’s a simple breakdown of the process:

  1. Choose Hidden Characters: Unicode offers several invisible characters, such as the zero-width space (U+200B), zero-width non-joiner (U+200C), and zero-width joiner (U+200D). These characters do not render visibly in the text.
  2. Encode the Message: Convert the hidden message into a binary or encoded format. Each bit or group of bits can be represented by a unique combination of invisible characters.
  3. Embed the Message: Insert the invisible characters into the plain text at predetermined positions or intervals, embedding the hidden message within the regular text.
  4. Extract the Message: A recipient who knows the encoding scheme can extract the invisible characters from the text and decode the hidden message.

Example: Hiding a Message

Let’s say we want to hide the message “Hi” within the text “Hello World”. First, we convert “Hi” into binary (using ASCII values):

  • H = 72 = 01001000
  • i = 105 = 01101001

Next, we map these binary values to invisible characters. For simplicity, let’s use the zero-width space (U+200B) for ‘0’ and zero-width non-joiner (U+200C) for ‘1’. The binary for “Hi” becomes a sequence of these characters:

  • H: 01001000 → U+200B U+200C U+200B U+200B U+200C U+200B U+200B U+200B
  • i: 01101001 → U+200B U+200C U+200C U+200B U+200C U+200B U+200B U+200C

We then embed this sequence in the text “Hello World”:

H\u200B\u200C\u200B\u200B\u200C\u200B\u200B\u200Be\u200B\u200C\u200C\u200B\u200C\u200B\u200B\u200Cllo World

To the naked eye, “Hello World” appears unchanged, but the hidden message “Hi” is embedded within.
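
Here is a short Python sketch of exactly this scheme: zero-width space for ‘0’, zero-width non-joiner for ‘1’, eight bits per ASCII character. The function names are just illustrative:

ZERO, ONE = "\u200b", "\u200c"   # zero-width space = 0, zero-width non-joiner = 1

def hide(cover: str, secret: str) -> str:
    bits = "".join(f"{ord(c):08b}" for c in secret)
    payload = "".join(ZERO if b == "0" else ONE for b in bits)
    # Insert the invisible payload after the first character of the cover text.
    return cover[0] + payload + cover[1:]

def reveal(text: str) -> str:
    bits = "".join("0" if c == ZERO else "1" for c in text if c in (ZERO, ONE))
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

stego = hide("Hello World", "Hi")
print(stego)            # looks like a plain "Hello World"
print(reveal(stego))    # prints "Hi"

Copying and pasting the printed string in most editors keeps the payload intact, which is exactly what makes the technique both handy and easy to overlook.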

Advantages and Disadvantages

Advantages:

  • Subtlety: The hidden information is invisible to the casual observer.
  • Preserves Original Format: The visible text remains unaltered, maintaining readability and meaning.
  • Easy to Implement: Inserting and extracting hidden characters is straightforward with proper tools.

Disadvantages:

  • Limited Capacity: The amount of data that can be hidden is relatively small.
  • Vulnerability: If the presence of hidden characters is suspected, they can be detected and removed.
  • Dependence on Format: Changes in text formatting or encoding can corrupt the hidden message.

Practical Applications

  1. Secure Communication: Concealing sensitive messages within seemingly innocuous text.
  2. Watermarking: Embedding copyright information in digital documents.
  3. Data Integrity: Adding hidden markers to verify the authenticity of text.

Conclusion

Unicode steganography in plain text with hidden characters offers a clever and discreet way to conceal information. By understanding and utilizing the invisible aspects of Unicode, individuals can enhance their data security practices, ensuring their messages remain hidden in plain sight. As with all security techniques, it’s essential to stay informed about potential vulnerabilities and to use these methods responsibly.

Understanding Canary Tokens

In the realm of cybersecurity, staying ahead of potential threats is paramount. One innovative method that has gained traction in recent years is the use of canary tokens. These digital tripwires are designed to alert organizations to potential breaches and unauthorized access. In this blog post, we’ll explore what canary tokens are, how they work, and why they are becoming an essential tool in the cybersecurity toolkit.

Related video from my channel:

What are Canary Tokens?

Canary tokens, inspired by the canaries historically used in coal mines to detect dangerous gases, are digital markers that serve as early warning systems for unauthorized access or malicious activity. When a canary token is accessed, triggered, or interacted with in any unauthorized manner, it sends an alert to the network administrators, signaling a potential security breach.

These tokens can take various forms, including:

  1. Documents: Files with embedded tracking capabilities.
  2. Web URLs: Links that trigger alerts when visited.
  3. API Keys: Fake credentials that generate warnings when used.
  4. DNS Entries: Domain name entries that alert administrators when queried.

How Do Canary Tokens Work?

The operation of canary tokens is straightforward yet effective. Here’s a typical workflow, with a small code sketch after the list:

  1. Deployment: Canary tokens are strategically placed within a network, embedded in documents, or distributed in ways that they appear attractive to potential attackers.
  2. Monitoring: The tokens remain dormant until they are accessed or triggered. They are designed to look like genuine assets or credentials, making them appealing targets.
  3. Alerting: When a token is accessed, it sends an alert to the administrators. This alert can be in the form of an email, SMS, or integration with a monitoring system.
  4. Response: Upon receiving an alert, administrators can investigate the breach, determine the extent of the intrusion, and take necessary actions to mitigate the threat.
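
To make the idea concrete, here is a minimal sketch of the web-URL variety using only Python’s standard library. The bait path is a hypothetical name, and the alert is just a print statement standing in for a real email/SMS/webhook integration:

from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer

TOKEN_PATH = "/backups/passwords-2024.xlsx"  # hypothetical bait URL

class CanaryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == TOKEN_PATH:
            stamp = datetime.now(timezone.utc).isoformat()
            # In a real deployment this would fire an email/SMS/webhook alert.
            print(f"ALERT {stamp}: canary triggered by {self.client_address[0]} "
                  f"(User-Agent: {self.headers.get('User-Agent')})")
        self.send_response(404)  # respond blandly so the visitor suspects nothing
        self.end_headers()

HTTPServer(("0.0.0.0", 8080), CanaryHandler).serve_forever()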

Why Use Canary Tokens?

Canary tokens offer several advantages that make them a valuable addition to any cybersecurity strategy:

1. Early Detection

Canary tokens provide early warnings of potential security breaches, allowing organizations to respond quickly before significant damage occurs. This proactive approach can prevent data theft, system compromise, and other malicious activities.

2. Simplicity and Low Cost

Implementing canary tokens is relatively simple and cost-effective compared to other cybersecurity measures. They do not require complex infrastructure changes or significant financial investments.

3. Minimal False Positives

Since canary tokens are designed to be accessed only in specific scenarios, the likelihood of false positives is low. Alerts generated by canary tokens are more likely to indicate genuine security incidents.

4. Versatility

Canary tokens can be customized to fit various scenarios and environments. Whether embedded in documents, disguised as login credentials, or hidden in web applications, they can be tailored to meet specific security needs.

5. Psychological Deterrence

The knowledge that canary tokens are in place can act as a psychological deterrent for potential attackers. The risk of triggering an alert and being detected can discourage malicious activities.

Real-World Applications of Canary Tokens

Protecting Sensitive Data

Organizations dealing with sensitive information, such as financial institutions or healthcare providers, can embed canary tokens in critical files. If these files are accessed or exfiltrated, administrators are immediately alerted.

Monitoring Network Intrusions

Canary tokens can be placed within a network to detect unauthorized access. For example, creating a fake administrative login page with a canary token can reveal attempts to gain unauthorized control.

API Security

By deploying canary tokens as fake API keys, organizations can detect and track the misuse of stolen credentials. This helps in identifying compromised systems and taking corrective actions.

Conclusion

In an era where cyber threats are constantly evolving, canary tokens offer a proactive and efficient way to detect and respond to security incidents. Their simplicity, cost-effectiveness, and versatility make them an invaluable tool for organizations looking to bolster their cybersecurity defenses. By incorporating canary tokens into their security strategies, organizations can gain a critical edge in protecting their digital assets and maintaining the integrity of their networks.

Stay vigilant, stay secure, and consider deploying canary tokens as part of your comprehensive cybersecurity strategy.

Understanding the PNG Format and Draw.io Steganography

Introduction

Portable Network Graphics (PNG) is a popular raster graphics file format known for its lossless compression and wide support across various platforms and applications. In this blog post, we’ll delve into how PNG works, its format structure with a focus on headers and chunks, and how Draw.io leverages these features to embed drawing code within PNG files.

Related video from my YouTube channel:

The PNG Format

PNG was developed to replace the older Graphics Interchange Format (GIF). It offers several advantages, including better compression and support for a wider range of colors and transparency levels. Unlike JPEG, which is a lossy format, PNG preserves the original image quality, making it ideal for images that require precise details, such as text, graphics, and illustrations.

Structure of a PNG File

A PNG file is composed of a series of chunks. Each chunk has a specific function and structure, allowing for flexible and efficient image data storage. Here’s a breakdown of the core components of a PNG file:

  1. PNG Signature: The file starts with an 8-byte signature that identifies the file as a PNG image. This signature is essential for programs to recognize and process the file correctly.
  2. Chunks: Following the signature, the file consists of multiple chunks. Each chunk has four main parts:
    • Length (4 bytes): The length of the data field.
    • Chunk Type (4 bytes): A four-letter ASCII code that specifies the chunk type.
    • Chunk Data (variable length): The data contained in the chunk.
    • CRC (4 bytes): A cyclic redundancy check value for error-checking.

There are several critical chunks, including:

  • IHDR (Image Header): Contains basic information about the image, such as width, height, bit depth, color type, compression method, filter method, and interlace method.
  • PLTE (Palette): Defines the color palette used if the image is paletted.
  • IDAT (Image Data): Contains the actual image data, compressed using the zlib algorithm.
  • IEND (Image End): Marks the end of the PNG file.

Additional chunks can store metadata, text information, and other data, enabling extended functionalities.
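
The chunk layout is simple enough to walk through with a short script. Here is a sketch in Python that checks the signature and lists each chunk’s type and data length:

import struct
import sys

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

with open(sys.argv[1], "rb") as f:
    if f.read(8) != PNG_SIGNATURE:
        sys.exit("not a PNG file")
    while True:
        header = f.read(8)                      # 4-byte length + 4-byte type
        if len(header) < 8:
            break
        length, chunk_type = struct.unpack(">I4s", header)
        print(chunk_type.decode("ascii"), length)
        f.seek(length + 4, 1)                   # skip chunk data + 4-byte CRC
        if chunk_type == b"IEND":
            break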

How Draw.io Embeds Code in PNG Files

Draw.io is an online diagramming tool that allows users to create a wide range of diagrams, from flowcharts to network diagrams. One of its unique features is the ability to embed the diagram’s XML code directly within a PNG file. This makes it easy to share and store diagrams without needing separate files for the image and the underlying code.

Here’s how Draw.io achieves this:

  1. Embedding XML in a PNG: Draw.io takes advantage of PNG’s chunk-based structure by adding a custom chunk that contains the diagram’s XML data. This chunk is typically labeled zTXt or tEXt to indicate compressed or uncompressed textual data, respectively.
  2. Custom Chunk Integration: When a user saves a diagram as a PNG in Draw.io, the application generates the diagram’s XML representation and compresses it if necessary. This XML data is then inserted into a custom chunk within the PNG file.
  3. Reading Embedded Data: When the PNG file is opened in Draw.io, the application scans the chunks, identifies the custom chunk containing the XML data, extracts it, and reconstructs the diagram based on the embedded code.

This seamless integration allows users to benefit from the portability and compatibility of the PNG format while maintaining the ability to edit and update the diagrams within Draw.io.
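
Building on the chunk walker above, a short sketch can extract such textual chunks, decompressing zTXt payloads with zlib. Note that this is a generic extractor: the exact keyword and encoding Draw.io uses for its XML is an implementation detail that may vary between versions, so treat this as a starting point:

import struct
import sys
import zlib

with open(sys.argv[1], "rb") as f:
    f.read(8)  # skip the PNG signature
    while True:
        header = f.read(8)
        if len(header) < 8:
            break
        length, chunk_type = struct.unpack(">I4s", header)
        data = f.read(length)
        f.read(4)  # skip the CRC
        if chunk_type == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            print(keyword.decode("latin-1"), "->", text[:120])
        elif chunk_type == b"zTXt":
            keyword, _, rest = data.partition(b"\x00")
            # rest[0] is the compression method byte; the payload follows it.
            print(keyword.decode("latin-1"), "->", zlib.decompress(rest[1:])[:120])
        if chunk_type == b"IEND":
            break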

Conclusion

PNG is a versatile and powerful image format, and its chunk-based structure offers extensive flexibility for embedding additional data. Draw.io leverages this feature to embed the diagram’s XML code directly within PNG files, making it convenient for users to share and edit diagrams without losing any information. Understanding the inner workings of PNG and its structure not only enhances our appreciation for this format but also opens up possibilities for creative and innovative uses in various applications.

Interesting links

https://es.wikipedia.org/wiki/Portable_Network_Graphics

https://github.com/pedrooaugusto/steganography-png

Note: This post has been partly generated with ChatGPT

Scaling WordPress with Helm and k8s

In the ever-evolving landscape of web development and content management, WordPress stands as a steadfast titan, empowering millions of websites with its user-friendly interface and robust features. However, deploying WordPress can sometimes be a challenging task, especially for those new to server management and configuration. Fortunately, with the advent of containerization and orchestration technologies like Kubernetes, deploying WordPress has become more streamlined and efficient than ever before. One such method is leveraging the Bitnami Helm Chart, offering a seamless solution for deploying WordPress on Kubernetes clusters. In this blog post, we’ll explore the process of deploying WordPress using the Bitnami Helm Chart, highlighting its simplicity and effectiveness.

What is Bitnami?

Before delving into the deployment process, let’s take a moment to understand Bitnami. Bitnami is a well-known name in the world of application packaging and deployment automation. They offer a vast library of pre-configured software packages, including popular applications like WordPress, Drupal, Joomla, and many others. These packages are designed to be easily deployable across various platforms, making it convenient for developers and administrators to set up complex applications with minimal effort.

Their WordPress chart is the most active and most downloaded among the ones listed on artifacthub.io.

Introducing Helm and Kubernetes

Helm is a package manager for Kubernetes that simplifies the process of deploying, managing, and upgrading applications. It uses charts, which are packages of pre-configured Kubernetes resources, to define the structure of an application. Kubernetes, on the other hand, is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications.

Deploying WordPress with Bitnami Helm Chart

Now, let’s walk through the steps of deploying WordPress using the Bitnami Helm Chart:

  1. Setup Kubernetes Cluster: Before deploying WordPress, you’ll need to have a Kubernetes cluster up and running. This can be a local cluster using tools like Minikube or a cloud-based solution like Google Kubernetes Engine (GKE), Amazon Elastic Kubernetes Service (EKS), or Microsoft Azure Kubernetes Service (AKS).
  2. Install Helm: Install Helm on your local machine or wherever you’ll be running the Helm commands. Helm provides a command-line interface (CLI) for managing charts and releases.
  3. Add Bitnami Repository:
    • Add the Bitnami Helm repository to Helm by running the following command:
    • helm repo add bitnami https://charts.bitnami.com/bitnami
  4. Customize Values (Optional): you can customize the values in the values.yaml file to configure aspects of the WordPress deployment, such as resource limits, database credentials, and ingress settings. Make sure you have read their great README to understand the different options you have (see the example right after this list).
  5. Deploy WordPress:
    • Finally, deploy WordPress using the Bitnami WordPress Helm Chart with the following command: helm install my-wordpress bitnami/wordpress
  6. Access WordPress: Once the deployment is complete, you can access your WordPress site by retrieving the external IP address or domain associated with the WordPress service. Simply navigate to that address in your web browser, and you should see the WordPress installation wizard, allowing you to set up your site.
    • Hint: if you enabled ingress, you can always describe the ingress resource to see how to reach it. Otherwise you need to describe the SVC.
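
For step 4, a minimal custom values file could look like the sketch below; the keys come from the chart’s README, and the hostname is just a placeholder. You would apply it with helm install my-wordpress bitnami/wordpress -f my-values.yaml:

# my-values.yaml: a few commonly tuned knobs (illustrative values)
wordpressUsername: admin
replicaCount: 2
ingress:
  enabled: true
  hostname: blog.example.com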

Benefits of Using Bitnami Helm Chart for WordPress

Deploying WordPress with the Bitnami Helm Chart offers several advantages:

  • Simplified Deployment: The Helm Chart abstracts away the complexity of deploying WordPress on Kubernetes, making it accessible to developers of all skill levels.
  • Consistency: Bitnami’s extensive experience in packaging applications ensures that the WordPress deployment is reliable and consistent across different environments.
  • Customization: While the default configuration works out of the box, you have the flexibility to customize various aspects of the deployment to suit your specific requirements.
  • Scalability: Kubernetes enables seamless scaling of WordPress instances to handle varying levels of traffic and workload.

A common use case example

Let’s say you want to deploy WordPress with high availability, able to scale horizontally. Checking the README, you will see you want to increase replicaCount from the default of 1 to N.

These are the components we would have:

  • Ingress: you might need a more complex ingress configuration if you want to enforce network-level security.
  • WordPress pods: instead of having a single replica, you will want N, able to grow automatically.
  • MySQL service: here lives most of your WordPress state, except uploads.
  • Memcached: make your frontend fast! Avoid touching the DB over and over again for the same posts.

Related config example:

autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 9
    targetCPU: 50
    targetMemory: 50

However, once you can have N pods, you need shared storage for certain things. If you are lucky with the requirements, you’d better not offer plugin installation from the interface; instead, burn the plugins into a custom image or install them via a customPostInitScript. That way you can use this config, which only uses the shared volume for uploads and config:

  extraEnvVars:
  - name: WORDPRESS_DATA_TO_PERSIST
    # Note: we avoid persisting plugins/themes for performance reasons
    value: "wp-config.php wp-content/uploads"

If you need to offer plugin installation through the admin interface, you will need a really fast volume for it. E.g. Azure Files is really bad for this because of all those tiny PHP files, even using the premium offering. I thought OPcache would limit the impact, but it was not enough; leave a comment if you know a tweak for this, as I was unable to make it perform well enough and the admin interface was horrible to use. At least the user-facing part can be easily cached, though.

Lastly, you really want to enable Memcached. You can use a Memcached pod deployed by the chart or an external service. You will also need the W3 Total Cache plugin so that WordPress can take advantage of it.

  memcached:
    enabled: true

Common pitfalls and solutions

Troubleshooting hints

If you are reproducing performance issues, the best thing you can do is deploy pods running as root in DEV, so that you can add a few var_dump calls or even install Xdebug, which will find the culprit for sure:

Note: Be aware that this is horrible for production envs. I recommend only enabling it in local/DEV k8s!!

  # Configuration to run wordpress as root.
  # Only enable for troubleshooting, e.g profiling with xdebug
  #podSecurityContext:
  #  enabled: true
  #  fsGroup: 0
  #containerSecurityContext:
  #  runAsNonRoot: false
  #  runAsGroup: 0
  #  runAsUser: 0
  #  readOnlyRootFilesystem: false
  #  privileged: true
  #  allowPrivilegeEscalation: true

You might also need to disable health checks so that you can debug stuff there:

Note: same note, only for local/DEV envs.

  # Health checks override, only set as false for troubleshooting
  #livenessProbe:
  #  enabled: false
  #readinessProbe:
  #  enabled: false
  #startupProbe:
  #  enabled: false

Populating the volume and editing wp-config.php

You probably need to populate the /uploads folder or tweak the wp-config.php file. Just use kubectl cp to copy the files into the pod.

About wp-config.php persistence

The config file is generated according to the values.yaml when helm install is run, but not on upgrade; that is expected behaviour. However, you can at least override the database config, which is a common thing you might need to change:

  # We are persisting wp-config.php but we need to update the DB when needed
  overrideDatabaseSettings: yes

Additionally, if you need to update the wp-config.php file you can use kubectl cp. An alternative would be using a secret for the config instead (check existingWordPressConfigurationSecret in the README).

existingWordPressConfigurationSecret: "wp-config-secret"

Running customPostInitScripts every time the pods are created

You can try this workaround; thank me in the comments, or please share a better solution if you know one:

    my-script.sh: |
      #!/bin/bash
      set -x
      # Plugins repository is https://wordpress.org/plugins
      #export WP_CLI_PACKAGES_DIR=/bitnami/wordpress/wpcli-packages
      # Workaround for https://github.com/bitnami/charts/issues/21216
      (sleep 10 && rm -f /bitnami/wordpress/.user_scripts_initialized)&
      echo "Finished my-script.sh"

Customizing more stuff

If the Bitnami chart’s values.yaml is not enough for your use case, you can always create your own chart which uses the Bitnami one as a child dependency. E.g. that way you can have your own ingress.yaml file:

apiVersion: v2
name: my-wordpress-chart
description: My WordPress chart
type: application
version: 1.0.0
appVersion: 1.0.0

dependencies:
  - name: wordpress
    version: X.Y.Z
    repository: oci://registry-1.docker.io/bitnamicharts

You can also fork the chart easily by just copying it locally and using a local reference instead of the OCI one. That is also a solution if you want to make sure you don’t depend on docker.io for chart retrieval.

Beyond that, you can build and use your own WordPress Docker image and repository.

Note: never customize a Docker image in order to burn in default config containing secrets!

Unable to connect to Azure MySQL

If you cannot connect to Azure MySQL even though the host and credentials are fine, just try this (and you’re welcome):

  wordpressExtraConfigContent: "define('MYSQL_CLIENT_FLAGS', MYSQLI_CLIENT_SSL);"

Conclusion

The Bitnami Helm Chart provides a hassle-free solution for deploying WordPress on Kubernetes, allowing developers to focus on building and managing their websites without getting bogged down by infrastructure concerns. By leveraging the power of Helm and Kubernetes, deploying WordPress has never been easier or more efficient. Whether you’re a seasoned Kubernetes pro or just getting started, the Bitnami Helm Chart for WordPress is a valuable tool in your arsenal for modern web development. However, there are different use cases that require different configurations, and you’ll need to work on that.

About this blog post

“A common use case example” and “Common pitfalls and solutions” have been 100% written by humans, whereas the rest of the blog post has been generated with LLM and tweaked a bit with extra details.


HOWTO transcribe from MP4 to TXT with Whisper AI

In an era where information is constantly flowing through various forms of media, the need to extract and transcribe audio content has become increasingly important. Whether you’re a journalist, a content creator, or simply someone looking to convert spoken words into written text, the process of transcribing audio can be a game-changer. In this guide, we’ll explore how to transcribe audio from an MP4 file to text using Whisper AI, a powerful automatic speech recognition (ASR) system developed by OpenAI.

Related video from my YouTube channel:

What is Whisper AI?

Whisper AI is an advanced ASR system designed to convert spoken language into written text. It has been trained on an extensive dataset, making it capable of handling various languages and accents. Whisper AI has numerous applications, including transcription services, voice assistants, and more. In this guide, we will focus on using it for transcribing audio from MP4 files to text.

Prerequisites

Before you can start transcribing MP4 files with Whisper AI, make sure you have the following prerequisites in place:

  1. Docker: Docker is a platform for developing, shipping, and running applications in containers. You’ll need Docker installed on your system. If you don’t have it, you can download and install Docker.
  2. MP4 to MP3 Conversion: this workflow takes MP3 audio as input, so if your recording is an MP4 file, you’ll need to extract the audio to MP3 first. There are various tools available for this purpose; FFmpeg provides a reliable and versatile conversion process:

ffmpeg -i 20230523_111106-Meeting\ Recording.mp4 20230523_111106-Meeting\ Recording.mp3

Transcribing MP4 to TXT with Whisper AI

Now, let’s walk through the steps to transcribe an MP4 file to text using Whisper AI. We’ll assume you already have your MP4 file converted to MP3.

Step 1: Clone the Whisper AI Docker Repository

First, clone the Whisper AI Docker repository to your local machine. Open a terminal and run the following command:

git clone https://github.com/hisano/openai-whisper-on-docker.git

Step 2: Navigate to the Repository

Change your current directory to the cloned repository:

cd openai-whisper-on-docker

Step 3: Build the Docker Image

Build the Docker image for Whisper AI with the following command:

docker image build --tag whisper:latest .

Step 4: Set Up Volume and File Name

Set the VOLUME_DIRECTORY to your current directory and specify the name of your MP3 file. In this example, we’ll use “hello.mp3”:

VOLUME_DIRECTORY=$(pwd)

FILE_NAME=hello.mp3

Step 5: Copy Your MP3 File

Copy your MP3 file (the one you want to transcribe) to the current directory.

cp ../20230503_094932-Meeting\ Recording.mp3 ./$FILE_NAME

Step 6: Transcribe the MP3 File

Finally, use the following command to transcribe the MP3 file to text using Whisper AI. In this example, we’re specifying the model as “small” and the language as “Spanish.” Adjust these parameters according to your needs:

docker container run --rm --volume ${VOLUME_DIRECTORY}:/data whisper --model small --language Spanish /data/$FILE_NAME

Once you execute this command, Whisper AI will process the audio file and provide you with the transcribed text output.

You’ll see the transcription is written to stdout, so consider redirecting the docker run output to a file:

docker container run --rm --volume ${VOLUME_DIRECTORY}:/data whisper --model small --language Spanish /data/$FILE_NAME &> result.txt

You can monitor how it goes with:

tail -f result.txt

If you see a warning like:

/usr/local/lib/python3.9/site-packages/whisper/transcribe.py:114: UserWarning: FP16 is not supported on CPU; using FP32 instead

It means that you lack a CUDA setup, so Whisper will run on your CPU.

Also notice that here we’re using the small model, which is good enough but perhaps too slow on CPU. On my machine, it takes around 2.5 hours to transcribe 3 hours of audio.

Conclusion

Transcribing audio from MP4 to text has never been easier, thanks to Whisper AI and the power of Docker. With this guide, you can efficiently convert spoken content into written text, opening up a world of possibilities for content creation, research, and more. Experiment with different Whisper AI models and languages to tailor your transcription experience to your specific needs. Happy transcribing!

Note: I’ve written this blog post with the help of ChatGPT based on my own experiments with Whisper AI. I’m just too lazy to write something coherent in English. Sorry for that, I hope you liked it anyway.


Prompt: “Write a blog post whose title is HOWTO transcribe from mp4 to txt with Whisper AI. It should explain what Whisper AI is but also explain how to extract mp3 from mp4, and the following commands, ignore first column: 10054 git clone https://github.com/hisano/openai-whisper-on-docker.git 10055 cd openai-whisper-on-docker 10056 docker image build --tag whisper:latest . 10057 VOLUME_DIRECTORY=$(pwd) 10058 FILE_NAME=hello.mp3 10059 cp ../20230503_094932-Meeting\ Recording.mp3 ./hello.mp3 10060 docker container run --rm --volume ${VOLUME_DIRECTORY}:/data whisper --model small --language Spanish /data/hello.mp3” . After that, I added some extra useful information about performance.

My AMD Hackintosh OpenCore triple boot in same disk notes

A few notes about the main points I learnt while setting up triple boot on my new PC:

  1. When picking the hardware components, search for success stories related to them, so that you make sure they’re compatible and someone has already prepared a configuration you can work from instead of building the setup from zero. E.g. a non-APU Ryzen (without the G suffix) + Gigabyte X570 + Radeon RX580.
  2. Be aware that if you want to use the Hackintosh as your only OS, Intel will be easier and better supported (e.g. Docker with the hypervisor, the Adobe suite…). My idea is to use Linux, leaving the macOS option for Xcode and Windows 10 for gaming and Windows-only software.
  3. OpenCore is currently the only option for AMD; don’t waste time reading about Clover. See this video as an intro (not enough to get into action, but you’ll get the general idea): https://www.youtube.com/watch?v=l_QPLl81GrY
  4. You can lose data quite easily, e.g. when touching partitions, so make sure you back up first if needed.
  5. Make sure you read this guide carefully, it’s more precise and updated than the video: https://dortania.github.io/OpenCore-Install-Guide/
  6. This guide is also quite interesting: https://github.com/alkalim/ryzen-catalina
  7. Once you’ve seen the video and read the guide, you’ll be ready if you understand these topics: boot USB, SSDT, ACPI, kexts, UEFI, config.plist, SMBIOS.
  8. If you find someone who already succeeded with your exact CPU + motherboard combo (e.g. lucky me!) the setup will be way easier, as you might avoid the pain of testing different kexts and configs, but you still need to make sure you understand what you’re doing (previous points). Otherwise your Mac install menu will appear in Russian and you’ll have to figure out why that happens and how to reset the NVRAM.
  9. You need to install the OSs in this order: Windows, Linux, macOS (3 pen drives). Both Windows and Linux need to be installed in UEFI mode, and once both are running like that, you’ll need to resize the EFI partition to at least 200MB, as that’s a Mac requirement (the EFI partition created by default by Windows is 100MB…).
  10. You also need a GParted USB so that you can create the Mac partition in the free space you left after installing Windows and Linux. You’ll use HFS+, but in the Mac installer’s partition tool you’ll need to enable journaling for it (File > Enable Journaling) and convert it to APFS. Otherwise it will complain about a missing “firmware partition” (EFI) even though you had already prepared it.
  11. In the middle of the installation it will reboot without warning and continue the installation from the disk.
  12. If the latest Realtek kext does not work for you, e.g. you are unable to configure the NIC during installation, try v2.2.2; it did the trick for me.
  13. Once successfully installed, you typically need to do a few post-install things:
    1. Just in case Windows Update messes with the OpenCore bootloader, make sure you add BootStrap.efi as a boot entry in the BIOS. That way you’ll always have the “OpenCore” option in the BIOS.
    2. You need to update the hard disk’s EFI partition. If you prepared the Mac boot USB with gibMacOS you might not have an EFI partition there; just mount the hard disk’s EFI partition manually, delete its EFI folder and drop in the one you have on the boot USB.
    3. If OpenCore is unable to detect Linux, make sure you installed it in UEFI mode, e.g. in Linux Mint picking the EFI partition as the boot partition.

Enjoy!

Other links:

https://github.com/ivmos/config/tree/master/ryzentosh

https://github.com/sergeycherepanov/homebrew-docker-virtualbox

Designing Data-Intensive Applications Book: Chapter 1 Summary

I’m starting a series of blog posts with summaries of this interesting book: Designing Data-Intensive Applications


What is a data-intensive application?

It’s an application where raw CPU power is rarely a limiting factor; the problems are instead the amount of data, the complexity of data, and the speed at which it changes. It is built from standard building blocks that provide commonly needed functionality.

In this chapter, we see the fundamentals of what we are trying to achieve.


Google Photos API, how to use it and why it will probably disappoint you


Recently I needed to close a Google Apps account, and I tried to migrate its albums programmatically. I’ll document the needed steps here and explain why this Google API is useless for most of us:

First you need an app token, which you can get from the Google Console at https://console.developers.google.com. There you need to register your project and enable the Photos Library API from the API library.

You should now have both a client_id and a client_secret, so you can fetch the code quite easily with an OAuth2 flow:

#!/bin/sh

client_id="foo"
client_secret="bar"

scope="https://www.googleapis.com/auth/photoslibrary+https://www.googleapis.com/auth/photoslibrary.sharing"

# Build the consent URL; opening it in a browser yields the authorization code.
url="https://accounts.google.com/o/oauth2/auth?client_id=$client_id&redirect_uri=urn:ietf:wg:oauth:2.0:oob&scope=$scope&response_type=code"

echo $url

If you open that output URL in a browser you’ll get the $code, and with that code you can fetch the tokens:

code="lol"
curl --request POST --data "code=$code&client_id=$client_id&client_secret=$client_secret&redirect_uri=urn:ietf:wg:oauth:2.0:oob&grant_type=authorization_code" https://accounts.google.com/o/oauth2/token

With the refresh_token you can already do what you need; here is an example Kotlin script I worked on: https://gist.github.com/ivmos/860b5db0ffeeeba8ffd33adebfaaa094
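
To sketch the same idea without Kotlin, here is what the refresh-token exchange and a first API call can look like in Python, using only the standard library (foo/bar/baz are the placeholder credentials from the steps above):

import json
import urllib.parse
import urllib.request

# Exchange the refresh token for a short-lived access token.
params = urllib.parse.urlencode({
    "client_id": "foo",
    "client_secret": "bar",
    "refresh_token": "baz",
    "grant_type": "refresh_token",
}).encode()
token = json.load(urllib.request.urlopen(
    "https://accounts.google.com/o/oauth2/token", data=params))

# List the account's albums through the Photos Library API.
request = urllib.request.Request(
    "https://photoslibrary.googleapis.com/v1/albums",
    headers={"Authorization": f"Bearer {token['access_token']}"})
print(json.load(urllib.request.urlopen(request)))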

But in the end, I just did it manually by zooming out in the web client. It turns out Google only offers consents that allow you to manipulate photos and albums created with your own app, so you can’t move photos between albums created by the official tool. This means you cannot organize your library automatically unless you only need to work with photos you would upload with your own app…

HOWTO see Google Calendar events in yearly view


It turns out I have already booked a few events for 2019, so I wanted a yearly view of everything I have. I was disappointed to see that the current Google Calendar yearly view is useless, as it’s just empty. There are lots of comments about this issue in this Google Product Forums entry.



So I searched for solutions and found these two:

  • Google Calendar Plus extension
    • I haven’t even tried it, as I’m tired of Chrome extensions, but it seems to work.
  • Visual-Planner project
    • A bit ugly, but it works and it’s open source, so this is what I’m using. You can use it without installing it here (you just need to OAuth into your Gmail account). The only drawback is that it does not display names for multi-day events; as a workaround you can create a single event for the first day, e.g. “Flight to London”.


Let me know if you have better alternatives.

I also hope Google implements this… Hello, Google PMs? 🙂