Project

General

Profile

Actions

Bug #6209

closed

modem: no response to ICMP Echo Req (ping to MS) with mssdr-ms

Added by fixeria 7 months ago. Updated 6 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
OsmocomBB modem (host)
Target version:
Start date:
10/06/2023
Due date:
% Done:

100%

Resolution:
Spec Reference:
Tags:

Description

I am trying to ping the MS from the BTS side, but seeing lots of errors and almost no Echo Rsp.

trxcon logs errors about GPRS UL BLOCK.req without prior TBF CFG.req:

20231005180900657 DSCH NOTICE trxcon(0)[0x5572f10900]: Reset scheduler  (sched_trx.c:190)
20231005180900657 DSCH NOTICE trxcon(0)[0x5572f10900]: Delete TDMA timeslot #0 (sched_trx.c:226)
20231005180900657 DSCH NOTICE trxcon(0)[0x5572f10900]: Add a new TDMA timeslot #3 (sched_trx.c:207)
20231005180900657 DSCH NOTICE trxcon(0)[0x5572f10900]: (Re)configure TDMA timeslot #3 as PDCH (sched_trx.c:276)
20231005180900657 DSCH NOTICE trxcon(0)[0x5572f10900]: TS3-PDTCH activating (sched_trx.c:476)
20231005180900657 DSCH NOTICE trxcon(0)[0x5572f10900]: TS3-PTCCH activating (sched_trx.c:476)
20231005180900843 DSCH NOTICE trxcon(0)[0x5572f10900]: Delete TDMA timeslot #3 (sched_trx.c:226)
20231005180900843 DGPRS ERROR trxcon(0)[0x5572f10900]: (PDCH-3) Rx UL BLOCK.req (fn=2003126, len=34), but this PDCH has no configured TBFs (l1gprs.c:643)
20231005180900871 DL1C NOTICE trxcon(0)[0x5572f10900]{PACKET_DATA}: Received reset request (FULL) (l1ctl.c:439)
20231005180900871 DSCH NOTICE trxcon(0)[0x5572f10900]: Reset scheduler and clock counter (sched_trx.c:190)
20231005180900872 DL1C NOTICE trxcon(0)[0x5572f10900]{RESET}: Received FBSB request (GSM900 979, timeout 100 TDMA FNs) (l1ctl.c:380)
20231005180900872 DSCH NOTICE trxcon(0)[0x5572f10900]: Add a new TDMA timeslot #0 (sched_trx.c:207)
20231005180900872 DSCH NOTICE trxcon(0)[0x5572f10900]: (Re)configure TDMA timeslot #0 as BCCH+CCCH+SDCCH/4+SACCH/4 (sched_trx.c:276)
20231005180900872 DSCH NOTICE trxcon(0)[0x5572f10900]: TS0-SCH activating (sched_trx.c:476)
20231005180900872 DSCH NOTICE trxcon(0)[0x5572f10900]: TS0-BCCH activating (sched_trx.c:476)
20231005180900872 DSCH NOTICE trxcon(0)[0x5572f10900]: TS0-RACH activating (sched_trx.c:476)
20231005180900872 DSCH NOTICE trxcon(0)[0x5572f10900]: TS0-CCCH activating (sched_trx.c:476)
20231005180900890 DAPP ERROR trxcon(0)[0x5572f10900]{FBSB_SEARCH}: Event TRXCON_EV_RX_DATA_IND not permitted (trxcon_shim.c:93)
20231005180901644 DL1C NOTICE trxcon(0)[0x5572f10900]{BCCH_CCCH}: Received 8-bit RACH request (offset=0, ra=0x78) (l1ctl.c:542)
20231005180901646 DSCHD NOTICE trxcon(0)[0x5572f10900]: TS0-RACH Scheduled 8-bit RACH (TS0: GSM, GMSK) at fn=2003300 (sched_lchan_rach.c:130)
20231005180901834 DL1C NOTICE trxcon(0)[0x5572f10900]{BCCH_CCCH}: Received L1CTL_DM_EST_REQ (tn=4, chan_nr=0xc4, tsc=7, tch_mode=SIGNALLING) (l1ctl.c:630)
20231005180901834 DL1C NOTICE trxcon(0)[0x5572f10900]{BCCH_CCCH}: L1CTL_DM_EST_REQ indicates single ARFCN GSM900 979 (l1ctl.c:572)
20231005180901834 DSCH NOTICE trxcon(0)[0x5572f10900]: Reset scheduler  (sched_trx.c:190)
20231005180901834 DSCH NOTICE trxcon(0)[0x5572f10900]: Delete TDMA timeslot #0 (sched_trx.c:226)
20231005180901834 DSCH NOTICE trxcon(0)[0x5572f10900]: Add a new TDMA timeslot #4 (sched_trx.c:207)
20231005180901834 DSCH NOTICE trxcon(0)[0x5572f10900]: (Re)configure TDMA timeslot #4 as PDCH (sched_trx.c:276)
20231005180901834 DSCH NOTICE trxcon(0)[0x5572f10900]: TS4-PDTCH activating (sched_trx.c:476)
20231005180901834 DSCH NOTICE trxcon(0)[0x5572f10900]: TS4-PTCCH activating (sched_trx.c:476)
20231005180902026 DSCH NOTICE trxcon(0)[0x5572f10900]: Delete TDMA timeslot #4 (sched_trx.c:226)
20231005180902026 DGPRS ERROR trxcon(0)[0x5572f10900]: (PDCH-4) Rx UL BLOCK.req (fn=2003382, len=34), but this PDCH has no configured TBFs (l1gprs.c:643)
20231005180902054 DL1C NOTICE trxcon(0)[0x5572f10900]{PACKET_DATA}: Received reset request (FULL) (l1ctl.c:439)
20231005180902054 DSCH NOTICE trxcon(0)[0x5572f10900]: Reset scheduler and clock counter (sched_trx.c:190)
20231005180902055 DL1C NOTICE trxcon(0)[0x5572f10900]{RESET}: Received FBSB request (GSM900 979, timeout 100 TDMA FNs) (l1ctl.c:380)
20231005180902055 DSCH NOTICE trxcon(0)[0x5572f10900]: Add a new TDMA timeslot #0 (sched_trx.c:207)
20231005180902055 DSCH NOTICE trxcon(0)[0x5572f10900]: (Re)configure TDMA timeslot #0 as BCCH+CCCH+SDCCH/4+SACCH/4 (sched_trx.c:276)
20231005180902055 DSCH NOTICE trxcon(0)[0x5572f10900]: TS0-SCH activating (sched_trx.c:476)
20231005180902055 DSCH NOTICE trxcon(0)[0x5572f10900]: TS0-BCCH activating (sched_trx.c:476)
20231005180902055 DSCH NOTICE trxcon(0)[0x5572f10900]: TS0-RACH activating (sched_trx.c:476)
20231005180902055 DSCH NOTICE trxcon(0)[0x5572f10900]: TS0-CCCH activating (sched_trx.c:476)
20231005180902511 DL1D NOTICE trxcon(0)[0x5572f10900]: L1CTL connection error: read() failed (rc=0): Resource temporarily unavailable (l1ctl_server.c:55)
20231005180902511 DL1C NOTICE trxcon(0)[0x5572f10900]: Closing L1CTL connection (l1ctl_server.c:203)
20231005180902511 DSCH NOTICE trxcon(0)[0x5572f10900]: Shutdown scheduler (sched_trx.c:173)
20231005180902511 DSCH NOTICE trxcon(0)[0x5572f10900]: Delete TDMA timeslot #0 (sched_trx.c:226)
<pre>

Please find the PCAPs (MS side + BTS side) attached.

Files

mssdr_gprs_ping_ms.pcap.gz mssdr_gprs_ping_ms.pcap.gz 269 KB fixeria, 10/05/2023 06:21 PM
mssdr_gprs_ping_bts.pcapng.gz mssdr_gprs_ping_bts.pcapng.gz 2.65 MB fixeria, 10/05/2023 06:21 PM
mssdr_gprs_ms_ping.pcap.gz mssdr_gprs_ms_ping.pcap.gz 1.11 MB fixeria, 10/13/2023 12:20 PM

Related issues

Related to OsmocomBB - Bug #6208: modem: "UL_TBF[0x55a5d78de0]{ASSIGN}: Event RX_UL_ACK_NACK not permitted"Resolvedfixeria10/06/2023

Actions
Actions #1

Updated by fixeria 7 months ago

  • Related to Bug #6208: modem: "UL_TBF[0x55a5d78de0]{ASSIGN}: Event RX_UL_ACK_NACK not permitted" added
Actions #2

Updated by pespin 7 months ago

fixeria the timestamp lifetime of the log from osmo-trx-ms doesn't match those in the pcaps :/

Actions #3

Updated by pespin 7 months ago

  • Status changed from New to Feedback
  • Assignee changed from pespin to fixeria

This looks like we are transmitting an UL block during free of a tbf and immediately after going back to CCCH.

It may be fixed by libosmo-gprs.git which was submitted earlier this week (df4f1cf0a3026b83177410008a8dd5794effe3ba).
In order to know I need matching pcaps with osmo-trx-ms logs, sorry.

Actions #4

Updated by fixeria 7 months ago

  • Status changed from Feedback to In Progress
Actions #5

Updated by fixeria 7 months ago

Hi Pau,

I did some more testing, and somehow I am not seeing these trxcon errors anymore:

20231005180900843 DGPRS ERROR trxcon(0)[0x5572f10900]: (PDCH-3) Rx UL BLOCK.req (fn=2003126, len=34), but this PDCH has no configured TBFs (l1gprs.c:643)

and furthermore I saw 3 ICMP Echo Resp made it back to the GGSN side. But after that the modem got stuck on PDCH endlessly receiving dummy blocks.

I recorded a PCAP on the MS side only, but this time it should contain everything:

  • osmo-trx-ms/trxcon logging,
  • layer23/modem Um frames and logging.

Here is a patch that makes layer23/modem send GPRS blocks over GSMTAP:

https://gerrit.osmocom.org/c/osmocom-bb/+/34733 layer23: send UL/DL GPRS blocks over GSMTAP

Assigning back to you. Let me know if anything is missing.

Actions #6

Updated by pespin 6 months ago

fixeria mssdr_gprs_ms_ping.pcap.gz starting from frame_nr 2554 seems to be buggy, submitting empty GSMTAP blocks:
- We should probably fix your https://gerrit.osmocom.org/c/osmocom-bb/+/34733 patch to avoid submitting empty blocks to gsmtap?
- We should fix wireshark to show empty GSMTAP payload as malformed.

Actions #7

Updated by pespin 6 months ago

fixeria, Hoernchen the empty gsmtap packets seems to be buggy RLC/MAC blocks received on the DL at the ms-sdr, see for instance:

2550    12:13:30.287279    Oct 13, 2023 14:13:30.287279000 CEST    127.0.0.1    59719    127.0.0.1    4729    GSMTAP    230    trxcon(0)[0x55b23abd20]: TS3-PDTCH Received bad frame (rc=-1, ber=444/444) at fn=496509 
2551    12:13:30.287485    Oct 13, 2023 14:13:30.287485000 CEST    127.0.0.1    58441    127.0.1.1    4729    GSMTAP    222    GPRS-RR(1)[0x557190c470]{PACKET_TRANSFER}: Received Event GRR_EV_PDCH_BLOCK_IND 
2552    12:13:30.287541    Oct 13, 2023 14:13:30.287541000 CEST    127.0.0.1    58441    127.0.1.1    4729    GSMTAP    191    Rx from lower layers: L1CTL-PDCH_DATA.indication 
2553    12:13:30.287584    Oct 13, 2023 14:13:30.287584000 CEST    127.0.0.1    58441    127.0.1.1    4729    GSMTAP    179    Dropping DL data block with length 0 
2554    12:13:30.287649    Oct 13, 2023 14:13:30.287649000 CEST    127.0.0.1    51492    127.0.0.1    4729    GSMTAP    58    [Malformed Packet]

So since DL blocks cannot be decoded, that means no USF can be decoded and hence "modem" cannot transmit anymore on the UL because it is not "scheduled" by any USF from the network.

Actions #8

Updated by pespin 6 months ago

Regarding the problem of MS staying for too long (or forever) in PDCH instead of going back to CCCH:

At the time the MS stops receiving DL blocks, it has a UL and a DL TBF assigned. The timers below would be the responsibles in this scenario to move the MS back to CCCH:
-UL TBF:
T3180 "Wait for Uplink State Flag After Data Block"
When transmitting an RLC/MAC block to the
network an instance of T3180 is started for
the TBF for which the block was intended. 5s.

-DL TBF:
T3190 "Wait for Valid Downlink Data Received from the Network". 5s.

So both are 5s. Last time the timers would be armed:

2395    12:13:30.084197    Oct 13, 2023 14:13:30.084197000 CEST    127.0.0.1    51492    127.0.0.1    4729    GSM RLC/MAC    98    GPRS DL DATA: CS3 TFI=0 BSN=15 USF=7 Len=36
2549    12:13:30.265814    Oct 13, 2023 14:13:30.265814000 CEST    127.0.0.1    51492    127.0.0.1    4729    GSM RLC/MAC    92    GPRS UL DATA: CS2 TFI=0 BSN=0 CV=3 Len=30

T3190 (DL TBF) is actually working, as can be seen here:

5240      12:13:35.084188      Oct 13, 2023 14:13:35.084188000 CEST      127.0.0.1      58441      127.0.1.1      4729      GSMTAP      186      TBF(DL:NR-2:TLLI-f0085356) Timeout of T3190

However, T3180 is not yet implemented on the UL TBF side, and hence even if T3190 triggers on the DL TBF side and frees it, the MS still keeps the UL TBF established and as a result it stays in PDCH.
Implementing T3180 should fix that problem, I'll do it now.

Actions #9

Updated by pespin 6 months ago

Find here some timer & test fixes, plus a patch implementing missing T3180:
remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/34770 rlcmac: Avoid re-arming T3182 in TBF Acknowledged mode [NEW]
remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/34771 rlcmac: rlcmac: tests: Account for T3180 existance in test_ul_tbf_t3182_timeout [NEW]
remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/34772 rlcmac: tests: Account for T3180 existance in test_ul_tbf_t3166_timeout [NEW]
remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/34773 rlcmac: ul_tbf: Implement T3180 [NEW]

After those are merge, I'd expect the MS to go to CCCH after 5 seconds given the current situation with lower layers.

Actions #10

Updated by fixeria 6 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 0 to 100

AFAICS, all patches have been merged. MT ping works. Marking as resolved.

Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)