Scalibur: Reading Body Composition from a Cheap Bluetooth Scale

A journey through BLE packet sniffing, protocol reverse-engineering, and Raspberry Pi deployment

Categories: python, iot, raspberry-pi, health, data, ble, hardware

How I built a Raspberry Pi dashboard to capture and visualize body composition data from a GoodPharm TY5108 Bluetooth scale

Author: Guy Freeman

Published: December 22, 2025

Modified: December 31, 2025

Body composition scales are genuinely useful devices. Step on, wait a few seconds, and you get weight, body fat percentage, muscle mass, and a handful of other metrics. The problem? Your data disappears into whatever proprietary app the manufacturer built, usually with aggressive upsells and questionable privacy practices.

I bought a cheap GoodPharm TY5108 scale (it advertises itself as “tzc” over Bluetooth LE) for around £20. It works well enough, but I wanted to own my data. So I built Scalibur: a Raspberry Pi-based system that captures the raw BLE advertisements, decodes the measurements, calculates body composition metrics, and displays everything on a simple web dashboard.

Note

For a less technical take on why I built this, see On Owning Your Data.

This is the story of how I got there.

The Hardware

The setup is straightforward:

GoodPharm TY5108 scale - A cheap body composition scale with BLE. It measures weight using load cells and impedance through electrodes on the surface (you need to stand barefoot for the full reading).
Raspberry Pi - Any model with Bluetooth LE works. I used a Pi 4, but a Pi Zero W would be fine for this.

The scale broadcasts its measurements as BLE advertisements. Unlike connected BLE devices that require pairing and a persistent connection, advertisement-based devices just shout their data into the void. Anyone listening can pick it up.

The Journey: Reverse-Engineering the Protocol

My first attempt used bleak, a popular Python BLE library. I could see the device advertising, but the high-level APIs abstracted away the manufacturer-specific data I needed. The scale wasn’t exposing GATT services in the usual way - all the interesting data was in the advertisement packets themselves.

I switched to aioblescan, which gives you raw HCI (Host Controller Interface) packets. This is the low-level interface between the Bluetooth controller and the host system. Here’s where it gets interesting:

def parse_hci_packet(data: bytes) -> tuple[str | None, int | None, bytes | None]:
    """Parse raw HCI LE advertising packet."""
    if len(data) < 14 or data[0:2] != b'\x04\x3e':
        return None, None, None

    if data[3] != 0x02:  # Not LE Advertising Report
        return None, None, None

    adv_len = data[13]
    adv_data = data[14:14 + adv_len]

    device_name = None
    manufacturer_id = None
    manufacturer_data = None

    i = 0
    while i < len(adv_data):
        if i + 1 >= len(adv_data):
            break
        length = adv_data[i]
        if length == 0 or i + length >= len(adv_data):
            break
        ad_type = adv_data[i + 1]
        ad_value = adv_data[i + 2:i + 1 + length]

        if ad_type == 0x09:  # Complete Local Name
            # Tolerate malformed names instead of raising UnicodeDecodeError
            device_name = ad_value.decode("utf-8", errors="replace")
        elif ad_type == 0xFF and len(ad_value) >= 2:  # Manufacturer Specific Data
            manufacturer_id = int.from_bytes(ad_value[0:2], "little")
            manufacturer_data = ad_value[2:]

        i += 1 + length

    return device_name, manufacturer_id, manufacturer_data

BLE advertising data is a series of length-type-value entries (often loosely called TLV): one length byte, one type byte, then the value. We’re looking for type 0xFF - manufacturer-specific data - which contains the actual scale reading.
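To make that layout concrete, here is a minimal sketch that walks a hand-built advertising payload one entry at a time. The payload bytes and the `iter_ad_structures` helper are illustrative assumptions, not a capture from the scale; the manufacturer entry uses 0xFFFF as a placeholder company ID.

```python
from collections.abc import Iterator


def iter_ad_structures(adv_data: bytes) -> Iterator[tuple[int, bytes]]:
    """Yield (ad_type, value) pairs from a BLE advertising payload."""
    i = 0
    while i < len(adv_data):
        length = adv_data[i]  # counts the type byte plus the value bytes
        if length == 0 or i + length >= len(adv_data):
            break  # padding or truncated entry: stop parsing
        yield adv_data[i + 1], adv_data[i + 2:i + 1 + length]
        i += 1 + length


# Hypothetical payload: flags, complete local name "tzc", manufacturer data.
sample = bytes.fromhex(
    "020106"        # len=2, type=0x01 (flags), value=0x06
    "0409747a63"    # len=4, type=0x09 (name), value=b"tzc"
    "05ffffff0102"  # len=5, type=0xFF, value=placeholder ID + 2 data bytes
)

entries = list(iter_ad_structures(sample))
for ad_type, value in entries:
    print(f"type=0x{ad_type:02x} value={value.hex()}")
```

The bounds check mirrors the one in `parse_hci_packet`: an entry whose declared length runs past the end of the buffer stops the walk rather than yielding a short value.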

The Debugging Saga

Getting the byte offsets right took several iterations. My git history tells the story:

f477996 Fix packet decoding: weight in data bytes, not manufacturer_id
28de4a4 Fix packet decoding: weight is in manufacturer ID field

I initially thought the weight was encoded in the manufacturer ID field (the first two bytes of the manufacturer-specific data). Then I thought the opposite. Both were wrong in different ways.

The actual layout, once I figured it out:

Bytes	Description
0-1	Weight (big-endian, divide by 10 for kg)
2-3	Impedance (big-endian, divide by 10 for ohms; 0 = not measured)
4-5	User ID
6	Status: 0x20 = weight only, 0x21 = weight + impedance
7+	MAC address (ignored)

The status byte was crucial. The scale first broadcasts 0x20 when it has a stable weight reading, then 0x21 once it’s also measured impedance (which takes a few more seconds and requires bare feet in contact with the electrodes).

def decode_packet(manufacturer_id: int, manufacturer_data: bytes) -> ScaleReading | None:
    """Decode tzc scale advertisement packet."""
    if len(manufacturer_data) < 7:
        return None

    weight_raw = int.from_bytes(manufacturer_data[0:2], "big")
    weight_kg = weight_raw / 10

    impedance_raw = int.from_bytes(manufacturer_data[2:4], "big")
    impedance_ohm = impedance_raw / 10 if impedance_raw > 0 else None

    user_id = int.from_bytes(manufacturer_data[4:6], "big")

    status = manufacturer_data[6]
    is_complete = status == 0x21 or (status == 0x20 and impedance_raw == 0)

    return ScaleReading(
        weight_kg=weight_kg,
        impedance_raw=impedance_raw,
        impedance_ohm=impedance_ohm,
        user_id=user_id,
        is_complete=is_complete,
        is_locked=is_complete,
    )
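As a sanity check on the byte layout in the table above, here is a minimal sketch that unpacks a synthetic payload with `struct`. The bytes are made up for illustration (72.5 kg, 500.0 ohms, user 1, status 0x21), and `unpack_reading` is a hypothetical helper rather than code from the Scalibur repository.

```python
import struct

# Synthetic payload, not a real capture:
# weight=725 (72.5 kg), impedance=5000 (500.0 ohm), user=1, status=0x21, then a 6-byte MAC.
payload = bytes.fromhex("02d5" "1388" "0001" "21" "aabbccddeeff")


def unpack_reading(data: bytes) -> dict:
    """Unpack the first 7 bytes: three big-endian uint16 fields and a status byte."""
    weight_raw, impedance_raw, user_id, status = struct.unpack(">HHHB", data[:7])
    return {
        "weight_kg": weight_raw / 10,
        "impedance_ohm": impedance_raw / 10 if impedance_raw > 0 else None,
        "user_id": user_id,
        "has_impedance": status == 0x21,
    }


reading = unpack_reading(payload)
print(reading)
```

Synthetic payloads like this one are also a cheap way to build up a regression suite without stepping on the scale for every change.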

Body Composition Math

Once you have weight and impedance, you can calculate body composition using BIA (Bioelectrical Impedance Analysis) formulas. These are well-documented - I used formulas compatible with openScale (https://github.com/oliexdev/openScale), an open-source Android app for body composition scales.

The key insight is that lean tissue conducts electricity better than fat. By measuring the body’s impedance and knowing height/age/gender, you can estimate lean body mass:

def calculate_body_composition(
    weight_kg: float,
    impedance_ohm: float,
    height_cm: int,
    age: int,
    gender: str,
) -> BodyComposition:
    """Calculate body composition using standard BIA formulas."""
    height_sq = height_cm**2

    # Lean Body Mass
    if gender == "male":
        lbm = 0.485 * (height_sq / impedance_ohm) + 0.338 * weight_kg + 5.32
    else:
        lbm = 0.474 * (height_sq / impedance_ohm) + 0.180 * weight_kg + 5.03

    # Body Fat
    fat_mass_kg = weight_kg - lbm
    body_fat_pct = (fat_mass_kg / weight_kg) * 100

    # BMR (revised Harris-Benedict equation, rounded coefficients)
    if gender == "male":
        bmr = 88.36 + (13.4 * weight_kg) + (4.8 * height_cm) - (5.7 * age)
    else:
        bmr = 447.6 + (9.2 * weight_kg) + (3.1 * height_cm) - (4.3 * age)

    # ... plus body water, muscle mass, bone mass, BMI

These formulas aren’t perfectly accurate - consumer BIA scales are notoriously inconsistent - but they’re good enough for tracking trends over time.
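To make the arithmetic concrete, here is a small worked example of the lean-mass and body-fat steps, using the coefficients from the function above. The inputs (a 180 cm, 75 kg male with 500 ohm impedance) are made-up numbers, and `lean_and_fat` is a hypothetical helper covering only those two steps.

```python
def lean_and_fat(
    weight_kg: float, impedance_ohm: float, height_cm: int, gender: str
) -> tuple[float, float]:
    """Return (lean body mass in kg, body fat %) using the coefficients quoted above."""
    index = height_cm**2 / impedance_ohm  # height squared over impedance
    if gender == "male":
        lbm = 0.485 * index + 0.338 * weight_kg + 5.32
    else:
        lbm = 0.474 * index + 0.180 * weight_kg + 5.03
    body_fat_pct = (weight_kg - lbm) / weight_kg * 100
    return lbm, body_fat_pct


# 180**2 / 500 = 64.8, so lbm = 0.485 * 64.8 + 0.338 * 75 + 5.32
lbm, fat_pct = lean_and_fat(75.0, 500.0, 180, "male")
print(f"lean mass {lbm:.1f} kg, body fat {fat_pct:.1f}%")
```

With these inputs the lean mass comes out at roughly 62.1 kg and body fat at roughly 17.2%. Because impedance sits in the denominator, a higher reading (for example from dry feet) lowers the lean-mass estimate and raises the fat percentage, which is one reason single readings are noisy.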

The Smart ETL Challenge

Here’s a problem I didn’t anticipate: the scale sends multiple packets per measurement. First you get weight-only packets (0x20), then eventually a complete packet with impedance (0x21). Sometimes the impedance packet arrives seconds after the weight.

I needed an ETL pipeline that could:

Group packets into “sessions” (measurements within 30 seconds of each other)
Pick the best packet from each session (prefer 0x21 over 0x20)
Detect which user profile the measurement belongs to (by weight range)
Calculate body composition only if the profile has complete parameters
Update existing measurements if better data arrives later

from datetime import datetime, timedelta


def group_into_sessions(packets: list[dict], gap_seconds: int = 30) -> list[list[dict]]:
    """Group packets (already sorted by timestamp) into sessions based on time gaps."""
    if not packets:
        return []

    sessions = []
    current_session = [packets[0]]

    for packet in packets[1:]:
        prev_time = datetime.fromisoformat(current_session[-1]["timestamp"])
        curr_time = datetime.fromisoformat(packet["timestamp"])

        if curr_time - prev_time > timedelta(seconds=gap_seconds):
            sessions.append(current_session)
            current_session = [packet]
        else:
            current_session.append(packet)

    sessions.append(current_session)
    return sessions

The ETL also handles the case where you step on the scale without bare feet (weight-only mode). If impedance arrives later for an existing weight measurement, it updates the record rather than creating a duplicate.
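The "pick the best packet" step from the list above can be sketched as follows. The `status` key on each packet dict is an assumption about how packets are stored (the snippets here only show `timestamp`), and `pick_best_packet` is a hypothetical name.

```python
STATUS_WEIGHT_ONLY = 0x20
STATUS_WITH_IMPEDANCE = 0x21


def pick_best_packet(session: list[dict]) -> dict:
    """Prefer the latest packet with impedance; otherwise the latest packet."""
    complete = [p for p in session if p["status"] == STATUS_WITH_IMPEDANCE]
    candidates = complete or session
    # Packets are assumed to be in chronological order, so the last one is newest.
    return candidates[-1]


session = [
    {"timestamp": "2025-12-22T07:00:01", "status": STATUS_WEIGHT_ONLY},
    {"timestamp": "2025-12-22T07:00:06", "status": STATUS_WITH_IMPEDANCE},
    {"timestamp": "2025-12-22T07:00:07", "status": STATUS_WEIGHT_ONLY},
]
best = pick_best_packet(session)
print(best["timestamp"])
```

Taking the latest candidate rather than the first is a design choice: later packets tend to come after the reading has settled, though that is an assumption about this scale rather than something the protocol guarantees.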

The Dashboard

The web interface is a Flask app with Chart.js for visualization and HTMX for dynamic updates. It shows:

Latest measurement (weight, body fat %)
30-day weight trend chart
Measurement history table
Profile selector dropdown
Profile management modal

The Scalibur dashboard showing weight trends and body composition metrics

@app.route("/")
def index():
    """Render the dashboard."""
    run_etl()  # Process any new packets
    profiles = db.get_profiles()
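    default_id = profiles[0]["id"] if profiles else None
    # type=int coerces the query string; invalid values fall back to the default
    profile_id = request.args.get("profile", default=default_id, type=int)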
    latest = db.get_latest_measurement(profile_id=profile_id)
    recent = db.get_measurements(limit=10, profile_id=profile_id)
    return render_template("index.html", latest=latest, recent=recent,
                          profiles=profiles, selected_profile=profile_id)

The ETL runs on every page load, so new measurements appear immediately. All queries are filtered by the selected profile, so each household member sees only their own data.

Multi-User Support

After the initial version worked, I faced a new challenge: my household has multiple people who use the scale. The original design assumed a single user with hardcoded height/age/gender values. I needed a way to identify who was standing on the scale.

Why Not Use the Scale’s User ID?

The BLE packets include a user ID field (bytes 4-5). My first attempt used this to identify users - but it turned out to be unreliable. The scale’s internal user management didn’t align with how we actually used it, and the IDs would sometimes change unexpectedly.

Weight-Based Profile Detection

The solution was simpler: identify users by their weight range. Household members typically have non-overlapping weight ranges, so a measurement of 75 kg almost certainly belongs to a different person than one of 55 kg.

def detect_profile(weight_kg: float, profiles: list[dict]) -> dict | None:
    """Find profile where weight falls within min/max range."""
    for profile in profiles:
        min_w = profile.get("min_weight_kg")
        max_w = profile.get("max_weight_kg")
        if min_w is not None and max_w is not None and min_w <= weight_kg <= max_w:
            return profile
    return None

Each profile stores a weight range (min/max), plus the body composition parameters (height, age, gender). When a measurement comes in, the ETL matches it to the appropriate profile and calculates body composition using that profile’s parameters.
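Since the first matching profile wins, the approach relies on ranges that do not overlap. Here is a minimal sketch of a check for that, assuming the same `min_weight_kg` / `max_weight_kg` keys; the `name` key and the `find_overlaps` helper are hypothetical.

```python
from itertools import combinations


def find_overlaps(profiles: list[dict]) -> list[tuple[str, str]]:
    """Return name pairs of profiles whose inclusive weight ranges intersect."""
    overlaps = []
    for a, b in combinations(profiles, 2):
        # Two inclusive ranges intersect when each starts before the other ends.
        if (
            a["min_weight_kg"] <= b["max_weight_kg"]
            and b["min_weight_kg"] <= a["max_weight_kg"]
        ):
            overlaps.append((a["name"], b["name"]))
    return overlaps


profiles = [
    {"name": "A", "min_weight_kg": 50.0, "max_weight_kg": 60.0},
    {"name": "B", "min_weight_kg": 70.0, "max_weight_kg": 80.0},
    {"name": "C", "min_weight_kg": 58.0, "max_weight_kg": 65.0},
]
clashes = find_overlaps(profiles)
print(clashes)
```

A check like this could run when a profile is saved, so that an ambiguous range is rejected up front instead of silently assigning measurements to whichever profile happens to come first.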

Profile Management

The dashboard includes an HTMX-powered modal for managing profiles. You can add, edit, or delete profiles without page reloads. When you update a profile’s height, age, or gender, all existing measurements for that profile are automatically recalculated with the new values:

@app.route("/api/profiles/<int:profile_id>", methods=["PUT"])
def update_profile(profile_id: int):
    """Update an existing profile."""
    # ... validation and save ...
    db.recalculate_profile_measurements(profile_id)
    return jsonify({"id": profile_id})

This means you can correct a profile’s parameters at any time and all historical body composition metrics will be updated accordingly.

Deployment

Deployment to a Raspberry Pi is a single command:

PI_USER=pi ./deploy.sh raspberrypi.local

This rsyncs the code, installs dependencies via uv, and sets up two systemd services:

scalibur-scanner.service - The BLE scanner daemon (needs CAP_NET_RAW for raw socket access)
scalibur-dashboard.service - The Flask dashboard on port 5000

SQLite with WAL mode handles concurrent access between the scanner writing packets and the dashboard reading measurements. The dashboard runs database migrations on startup, so existing installations are automatically upgraded when new schema changes are deployed (like the addition of the profiles table).
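For reference, turning on WAL mode from Python is a single pragma. This is a generic `sqlite3` sketch, not the Scalibur code: the database path and the `packets` table are placeholders.

```python
import os
import sqlite3
import tempfile

# WAL needs a file-backed database; an in-memory one reports "memory" instead.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")

conn = sqlite3.connect(db_path)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# Placeholder schema standing in for the scanner's packet log.
conn.execute("CREATE TABLE IF NOT EXISTS packets (timestamp TEXT, payload BLOB)")
conn.execute("INSERT INTO packets VALUES (?, ?)", ("2025-12-22T07:00:01", b"\x02\xd5"))
conn.commit()

# A second connection (the dashboard's role here) reads while the first stays open.
reader = sqlite3.connect(db_path)
count = reader.execute("SELECT COUNT(*) FROM packets").fetchone()[0]
reader.close()
conn.close()

print(mode, count)
```

The journal mode is stored in the database file, so it only needs to be set once; connections opened later pick up WAL automatically.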

Lessons Learned

Low-level BLE is tricky but rewarding. Most BLE tutorials focus on GATT services and characteristics. Advertisement-based protocols are less common but simpler once you understand HCI packets.

Iterative debugging with real hardware is slow. I burned a lot of time stepping on and off the scale, waiting for packets, checking hex dumps. A good test suite with captured packets would have saved hours.

Owning your data is worth the effort. The scale’s original app is fine, but now I have a SQLite database I can query however I want. I can export to CSV, build custom visualizations, or integrate with other health tracking systems.

Weight-based identification is surprisingly robust. When I needed multi-user support, my first instinct was to use the scale’s built-in user ID field. That turned out to be unreliable. The simpler solution - identifying users by their weight range - works better in practice. Household members almost always have non-overlapping weight ranges, and the approach requires no coordination with the scale’s internal state.

The code is on GitHub (https://github.com/gfrmin/scalibur) if you want to try it with your own TY5108 scale or adapt it for similar devices. The packet parsing logic might work for other cheap BLE scales with minor tweaks - the protocol seems fairly common among generic Chinese body composition scales.
