Re: BUG: using smp_processor_id() in preemptible [00000000] code: TCPTSK/1809


Jan Kiszka
 

On 07.06.21 12:18, Rainer Kloud wrote:
Hi,
 
I notice you are using -rt kernel. Do you actually need realtime
features?
Yes, I actually need the realtime feature. I have one task which
needs to run periodically in realtime (triggered every 10ms by an
external IRQ).
Please don't forget to share your kernel config with us so that we can
make sure your use case is covered subsystem-wise in CIP. From Siemens
side, we still have room for improvements in this regard, even more on -rt.
Attached you can find my kernel config. 
 
Would you propose it as a patch to
https://gitlab.com/cip-project/cip-kernel/cip-kernel-config?

Jun 1 09:11:46 sicam kernel: [46802.944299] [<c062f854>] (dump_stack) from [<c03a57ec>] (check_preemption_disabled+0x110/0x114)
Jun 1 09:11:46 sicam kernel: [46802.944316] [<c03a57ec>] (check_preemption_disabled) from [<c014163c>] (migrate_enable+0x40/0x488)
Jun 1 09:11:46 sicam kernel: [46802.944338] [<c014163c>] (migrate_enable) from [<c053ff0c>] (ip_finish_output2+0x21c/0x460)
Preemption should be on across migration-disabled sections, that's their
whole purpose. But maybe the check that preemption needs to be off when
using smp_processor_id needs relaxing to at least migration must be off.
Sorry, but I can not follow your words. What do you mean with 'needs relaxing to
at least migration must be off'?
Strike that, the case is too generic to be the cause in something that
has been in the field for this long by now.

From looking at the code of migrate_enable(), I suspect that we cause
the call to smp_processor_id() via "struct rq *rq = this_rq();". That
comes fairly early in migrate_enable(), which matches the small offset
in your backtrace (you can confirm that more precisely). That would
complain about "preemptible code" if the caller does not have preemption
or at least migration off. So my suspicion would be a
migrate_disable()/migrate_enable() imbalance in the code path you
triggered, likely somewhere in the TCP code.
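
To illustrate what I mean by an imbalance, here is a minimal user-space
sketch. It is not kernel code: struct task_model, cpu_id_is_stable() and
the model_* and run_* names are made up for this example, and the check
only mirrors the rule described above (warn if neither preemption nor
migration is off). It assumes the check in migrate_enable() runs before
the migration-disable counter is dropped.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the per-task state involved. */
struct task_model {
    int preempt_count;    /* > 0: preemption is disabled */
    int migrate_disable;  /* > 0: task is pinned to its current CPU */
    int warnings;         /* how often the "preemptible code" check fired */
};

/* Models the debug check behind smp_processor_id(): the CPU number is
 * only stable if preemption or at least migration is off. */
static bool cpu_id_is_stable(struct task_model *t)
{
    if (t->preempt_count > 0 || t->migrate_disable > 0)
        return true;
    t->warnings++;  /* stands in for the "BUG: using smp_processor_id()" report */
    return false;
}

static void model_migrate_disable(struct task_model *t)
{
    t->migrate_disable++;
}

static void model_migrate_enable(struct task_model *t)
{
    /* Stands in for the early this_rq() lookup, which needs the CPU id. */
    (void)cpu_id_is_stable(t);
    if (t->migrate_disable > 0)
        t->migrate_disable--;
}

/* Balanced pairing: the check never fires. Returns the warning count. */
int run_balanced(void)
{
    struct task_model t = {0};
    model_migrate_disable(&t);
    model_migrate_enable(&t);
    return t.warnings;
}

/* One enable too many: the second enable runs fully preemptible and
 * migratable, so the check fires once. Returns the warning count. */
int run_imbalanced(void)
{
    struct task_model t = {0};
    model_migrate_disable(&t);
    model_migrate_enable(&t);
    model_migrate_enable(&t);  /* the surplus call */
    return t.warnings;
}
```

In this model, run_balanced() reports no warning while run_imbalanced()
reports exactly one, raised from inside the surplus enable call. That is
the same shape as your backtrace, where the complaint comes from within
migrate_enable() itself.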

Did you already try more recent RT kernels, both in the 4.19 series as
well as maybe 5.10 or even the latest -rt? Possibly, this issue has been
fixed by now. If you can still reproduce it with the latest -rt, the
issue is better reported to linux-rt-users.

Jan

--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
