[PATCH 01/10] watchdog: Introduce hardware maximum heartbeat in watchdog core
Maxim Yu. Osipov
From: Guenter Roeck <linux@...>
commit 664a39236e718f9f03fa73fc01006da9ced04efc upstream. Introduce an optional hardware maximum heartbeat in the watchdog core. The hardware maximum heartbeat can be lower than the maximum timeout. Drivers can set the maximum hardware heartbeat value in the watchdog data structure. If the configured timeout exceeds the maximum hardware heartbeat, the watchdog core enables a timer function to assist sending keepalive requests to the watchdog driver. Signed-off-by: Guenter Roeck <linux@...> Signed-off-by: Wim Van Sebroeck <wim@...> [mosipov@... backported to 4.4.y] Signed-off-by: Maxim Yu. Osipov <mosipov@...> --- Documentation/watchdog/watchdog-kernel-api.txt | 19 +++- drivers/watchdog/watchdog_dev.c | 122 +++++++++++++++++++++++-- include/linux/watchdog.h | 30 ++++-- 3 files changed, 154 insertions(+), 17 deletions(-) diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt index d8b0d3367706..9887fa6d8f68 100644 --- a/Documentation/watchdog/watchdog-kernel-api.txt +++ b/Documentation/watchdog/watchdog-kernel-api.txt @@ -53,6 +53,7 @@ struct watchdog_device { unsigned int timeout; unsigned int min_timeout; unsigned int max_timeout; + unsigned int max_hw_heartbeat_ms; void *driver_data; struct mutex lock; unsigned long status; @@ -73,8 +74,18 @@ It contains following fields: additional information about the watchdog timer itself. (Like it's unique name) * ops: a pointer to the list of watchdog operations that the watchdog supports. * timeout: the watchdog timer's timeout value (in seconds). + This is the time after which the system will reboot if user space does + not send a heartbeat request if WDOG_ACTIVE is set. * min_timeout: the watchdog timer's minimum timeout value (in seconds). -* max_timeout: the watchdog timer's maximum timeout value (in seconds). + If set, the minimum configurable value for 'timeout'. +* max_timeout: the watchdog timer's maximum timeout value (in seconds), + as seen from userspace. 
If set, the maximum configurable value for + 'timeout'. Not used if max_hw_heartbeat_ms is non-zero. +* max_hw_heartbeat_ms: Maximum hardware heartbeat, in milli-seconds. + If set, the infrastructure will send heartbeats to the watchdog driver + if 'timeout' is larger than max_hw_heartbeat_ms, unless WDOG_ACTIVE + is set and userspace failed to send a heartbeat for at least 'timeout' + seconds. * bootstatus: status of the device after booting (reported with watchdog WDIOF_* status bits). * driver_data: a pointer to the drivers private data of a watchdog device. @@ -160,7 +171,11 @@ they are supported. These optional routines/operations are: and -EIO for "could not write value to the watchdog". On success this routine should set the timeout value of the watchdog_device to the achieved timeout value (which may be different from the requested one - because the watchdog does not necessarily has a 1 second resolution). + because the watchdog does not necessarily have a 1 second resolution). + Drivers implementing max_hw_heartbeat_ms set the hardware watchdog heartbeat + to the minimum of timeout and max_hw_heartbeat_ms. Those drivers set the + timeout value of the watchdog_device either to the requested timeout value + (if it is larger than max_hw_heartbeat_ms), or to the achieved timeout value. (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the watchdog's info structure). * get_timeleft: this routines returns the time that's left before a reset. diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index 56a649e66eb2..df43c586e53f 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -35,9 +35,11 @@ #include <linux/module.h> /* For module stuff/... */ #include <linux/types.h> /* For standard types (like size_t) */ #include <linux/errno.h> /* For the -ENODEV/... values */ +#include <linux/jiffies.h> /* For timeout functions */ #include <linux/kernel.h> /* For printk/panic/... 
*/ #include <linux/fs.h> /* For file operations */ #include <linux/watchdog.h> /* For watchdog specific items */ +#include <linux/workqueue.h> /* For workqueue */ #include <linux/miscdevice.h> /* For handling misc devices */ #include <linux/init.h> /* For __init/__exit/... */ #include <linux/uaccess.h> /* For copy_to_user/put_user/... */ @@ -49,6 +51,73 @@ static dev_t watchdog_devt; /* the watchdog device behind /dev/watchdog */ static struct watchdog_device *old_wdd; +static struct workqueue_struct *watchdog_wq; + +static inline bool watchdog_need_worker(struct watchdog_device *wdd) +{ + /* All variables in milli-seconds */ + unsigned int hm = wdd->max_hw_heartbeat_ms; + unsigned int t = wdd->timeout * 1000; + + /* + * A worker to generate heartbeat requests is needed if all of the + * following conditions are true. + * - Userspace activated the watchdog. + * - The driver provided a value for the maximum hardware timeout, and + * thus is aware that the framework supports generating heartbeat + * requests. + * - Userspace requests a longer timeout than the hardware can handle. + */ + return watchdog_active(wdd) && hm && t > hm; +} + +static long watchdog_next_keepalive(struct watchdog_device *wdd) +{ + unsigned int timeout_ms = wdd->timeout * 1000; + unsigned long keepalive_interval; + unsigned long last_heartbeat; + unsigned long virt_timeout; + unsigned int hw_heartbeat_ms; + + virt_timeout = wdd->last_keepalive + msecs_to_jiffies(timeout_ms); + hw_heartbeat_ms = min(timeout_ms, wdd->max_hw_heartbeat_ms); + keepalive_interval = msecs_to_jiffies(hw_heartbeat_ms / 2); + + /* + * To ensure that the watchdog times out wdd->timeout seconds + * after the most recent ping from userspace, the last + * worker ping has to come in hw_heartbeat_ms before this timeout. 
+ */ + last_heartbeat = virt_timeout - msecs_to_jiffies(hw_heartbeat_ms); + return min_t(long, last_heartbeat - jiffies, keepalive_interval); +} + +static inline void watchdog_update_worker(struct watchdog_device *wdd) +{ + if (watchdog_need_worker(wdd)) { + long t = watchdog_next_keepalive(wdd); + + if (t > 0) + mod_delayed_work(watchdog_wq, &wdd->work, t); + } else { + cancel_delayed_work(&wdd->work); + } +} + +static int __watchdog_ping(struct watchdog_device *wdd) +{ + int err; + + if (wdd->ops->ping) + err = wdd->ops->ping(wdd); /* ping the watchdog */ + else + err = wdd->ops->start(wdd); /* restart watchdog */ + + watchdog_update_worker(wdd); + + return err; +} + /* * watchdog_ping: ping the watchdog. * @wdd: the watchdog device to ping @@ -73,16 +142,27 @@ static int watchdog_ping(struct watchdog_device *wdd) if (!watchdog_active(wdd)) goto out_ping; - if (wdd->ops->ping) - err = wdd->ops->ping(wdd); /* ping the watchdog */ - else - err = wdd->ops->start(wdd); /* restart watchdog */ + wdd->last_keepalive = jiffies; + err = __watchdog_ping(wdd); out_ping: mutex_unlock(&wdd->lock); return err; } +static void watchdog_ping_work(struct work_struct *work) +{ + struct watchdog_device *wdd; + + wdd = container_of(to_delayed_work(work), struct watchdog_device, + work); + + mutex_lock(&wdd->lock); + if (wdd && watchdog_active(wdd)) + __watchdog_ping(wdd); + mutex_unlock(&wdd->lock); +} + /* * watchdog_start: wrapper to start the watchdog. 
* @wdd: the watchdog device to start @@ -95,6 +175,7 @@ out_ping: static int watchdog_start(struct watchdog_device *wdd) { int err = 0; + unsigned long started_at; mutex_lock(&wdd->lock); @@ -106,9 +187,13 @@ static int watchdog_start(struct watchdog_device *wdd) if (watchdog_active(wdd)) goto out_start; + started_at = jiffies; err = wdd->ops->start(wdd); - if (err == 0) + if (err == 0) { set_bit(WDOG_ACTIVE, &wdd->status); + wdd->last_keepalive = started_at; + watchdog_update_worker(wdd); + } out_start: mutex_unlock(&wdd->lock); @@ -146,8 +231,10 @@ static int watchdog_stop(struct watchdog_device *wdd) } err = wdd->ops->stop(wdd); - if (err == 0) + if (err == 0) { clear_bit(WDOG_ACTIVE, &wdd->status); + cancel_delayed_work(&wdd->work); + } out_stop: mutex_unlock(&wdd->lock); @@ -211,6 +298,8 @@ static int watchdog_set_timeout(struct watchdog_device *wdd, err = wdd->ops->set_timeout(wdd, timeout); + watchdog_update_worker(wdd); + out_timeout: mutex_unlock(&wdd->lock); return err; @@ -490,6 +579,8 @@ static int watchdog_release(struct inode *inode, struct file *file) /* Allow the owner module to be unloaded again */ module_put(wdd->ops->owner); + cancel_delayed_work_sync(&wdd->work); + /* make sure that /dev/watchdog can be re-opened */ clear_bit(WDOG_DEV_OPEN, &wdd->status); @@ -527,6 +618,11 @@ int watchdog_dev_register(struct watchdog_device *wdd) { int err, devno; + if (!watchdog_wq) + return -ENODEV; + + INIT_DELAYED_WORK(&wdd->work, watchdog_ping_work); + if (wdd->id == 0) { old_wdd = wdd; watchdog_miscdev.parent = wdd->parent; @@ -573,6 +669,8 @@ int watchdog_dev_unregister(struct watchdog_device *wdd) set_bit(WDOG_UNREGISTERED, &wdd->status); mutex_unlock(&wdd->lock); + cancel_delayed_work_sync(&wdd->work); + cdev_del(&wdd->cdev); if (wdd->id == 0) { misc_deregister(&watchdog_miscdev); @@ -589,7 +687,16 @@ int watchdog_dev_unregister(struct watchdog_device *wdd) int __init watchdog_dev_init(void) { - int err = alloc_chrdev_region(&watchdog_devt, 0, MAX_DOGS, 
"watchdog"); + int err; + + watchdog_wq = alloc_workqueue("watchdogd", + WQ_HIGHPRI | WQ_MEM_RECLAIM, 0); + if (!watchdog_wq) { + pr_err("Failed to create watchdog workqueue\n"); + return -ENOMEM; + } + + err = alloc_chrdev_region(&watchdog_devt, 0, MAX_DOGS, "watchdog"); if (err < 0) pr_err("watchdog: unable to allocate char dev region\n"); return err; @@ -604,4 +711,5 @@ int __init watchdog_dev_init(void) void __exit watchdog_dev_exit(void) { unregister_chrdev_region(watchdog_devt, MAX_DOGS); + destroy_workqueue(watchdog_wq); } diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h index 027b1f43f12d..26aba9b17ac3 100644 --- a/include/linux/watchdog.h +++ b/include/linux/watchdog.h @@ -10,8 +10,9 @@ #include <linux/bitops.h> -#include <linux/device.h> #include <linux/cdev.h> +#include <linux/device.h> +#include <linux/kernel.h> #include <uapi/linux/watchdog.h> struct watchdog_ops; @@ -61,12 +62,17 @@ struct watchdog_ops { * @bootstatus: Status of the watchdog device at boot. * @timeout: The watchdog devices timeout value (in seconds). * @min_timeout:The watchdog devices minimum timeout value (in seconds). - * @max_timeout:The watchdog devices maximum timeout value (in seconds). + * @max_timeout:The watchdog devices maximum timeout value (in seconds) + * as configurable from user space. Only relevant if + * max_hw_heartbeat_ms is not provided. + * @max_hw_heartbeat_ms: + * Hardware limit for maximum timeout, in milli-seconds. + * Replaces max_timeout if specified. * @driver-data:Pointer to the drivers private data. * @lock: Lock for watchdog core internal use only. * @status: Field that contains the devices internal status bits. - * @deferred: entry in wtd_deferred_reg_list which is used to - * register early initialized watchdogs. + * @deferred: Entry in wtd_deferred_reg_list which is used to + * register early initialized watchdogs. * * The watchdog_device structure contains all information about a * watchdog timer device. 
@@ -88,8 +94,11 @@ struct watchdog_device { unsigned int timeout; unsigned int min_timeout; unsigned int max_timeout; + unsigned int max_hw_heartbeat_ms; void *driver_data; struct mutex lock; + unsigned long last_keepalive; + struct delayed_work work; unsigned long status; /* Bit numbers for status flags */ #define WDOG_ACTIVE 0 /* Is the watchdog running/active */ @@ -121,13 +130,18 @@ static inline bool watchdog_timeout_invalid(struct watchdog_device *wdd, unsigne { /* * The timeout is invalid if + * - the requested value is larger than UINT_MAX / 1000 + * (since internal calculations are done in milli-seconds), + * or * - the requested value is smaller than the configured minimum timeout, * or - * - a maximum timeout is configured, and the requested value is larger - * than the maximum timeout. + * - a maximum hardware timeout is not configured, a maximum timeout + * is configured, and the requested value is larger than the + * configured maximum timeout. */ - return t < wdd->min_timeout || - (wdd->max_timeout && t > wdd->max_timeout); + return t > UINT_MAX / 1000 || t < wdd->min_timeout || + (!wdd->max_hw_heartbeat_ms && wdd->max_timeout && + t > wdd->max_timeout); } /* Use the following functions to manipulate watchdog driver specific data */ -- 2.11.0
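For readers following the scheduling arithmetic in watchdog_next_keepalive() above, here is a hypothetical standalone Python model of the same logic (not kernel code; the function name and millisecond-based interface are invented for illustration). The worker pings at half the effective hardware heartbeat, and its last ping must land one hardware heartbeat before the userspace (virtual) timeout expires:

```python
def next_keepalive_delay_ms(timeout_ms, max_hw_heartbeat_ms,
                            last_keepalive_ms, now_ms):
    """Model of the watchdog core's worker scheduling.

    timeout_ms:          timeout configured by userspace
    max_hw_heartbeat_ms: hardware limit reported by the driver
    last_keepalive_ms:   time of the most recent userspace ping
    now_ms:              current time
    """
    # Moment at which the virtual (userspace-visible) timeout expires.
    virt_timeout = last_keepalive_ms + timeout_ms
    # Effective hardware heartbeat: never longer than the user timeout.
    hw_heartbeat = min(timeout_ms, max_hw_heartbeat_ms)
    # Regular worker ping interval: half the hardware heartbeat.
    keepalive_interval = hw_heartbeat // 2
    # The last worker ping must come one hw heartbeat before expiry.
    last_heartbeat = virt_timeout - hw_heartbeat
    return min(last_heartbeat - now_ms, keepalive_interval)

# Userspace timeout of 60 s, hardware limited to 16 s: right after a
# userspace ping the next worker ping is scheduled 8 s out.
print(next_keepalive_delay_ms(60000, 16000, 0, 0))  # -> 8000
```

Close to expiry the returned delay shrinks (and eventually goes non-positive, at which point the core stops pinging and lets the hardware fire), which is what makes the userspace timeout behave as configured even though the hardware cannot count that far.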
|
|
[PATCH 00/10] Backport of watchdog core triggered keepalive infrastructure v2
Maxim Yu. Osipov
Hello,
This is the updated series after Ben's review:

* On 10/25/17 12:43, Ben Hutchings wrote (replying to Maxim's mail of Wed, 2017-10-04 at 16:40 +0200)
* On 10/25/17 12:46, Ben Hutchings wrote: "[...] Please use one of these formats for upstream references in the commit"

This series contains:
* a backport of the watchdog core infrastructure patches supporting watchdog keepalive handling;
* imx21_wdt converted to use infrastructure-triggered keepalives;
* backported support for the WATCHDOG_HANDLE_BOOT_ENABLED option.

On some systems it is desirable to have the watchdog reboot the system when it does not come up fast enough. This adds a kernel parameter to disable the auto-update of the watchdog before userspace takes over, and a kernel option to set the default.

One of the important use cases is safe system update. In that case the bootloader enables the watchdog, and Linux should only take it over at the point where the whole Linux system is sure to be working again, including the userspace services that could take the next update. If we start triggering the watchdog too early, as was the case so far, we risk entering a system state where the kernel works but not all the other services we need do.

An i.MX6 based board was used as a test platform. We assume the watchdog is enabled in the bootloader and set to 120 seconds (the maximum timeout supported by the i.MX watchdog timer). After applying these patches, build the i.MX watchdog driver (imx21_wdt) into the kernel and perform the following test cases:

1) Option WATCHDOG_HANDLE_BOOT_ENABLED is turned off in the kernel configuration and no userspace application triggers the watchdog. Make sure that no userspace application is triggering the watchdog. After around 120 seconds (the timeout set in the bootloader) the board will reboot.
2) Option WATCHDOG_HANDLE_BOOT_ENABLED is turned off in the kernel configuration, but a userspace application triggers the watchdog more frequently than the watchdog's timeout (by default set to 60 seconds for imx21_wdt). The watchdog will not reboot the board as long as the application keeps re-arming it.

3) Option WATCHDOG_HANDLE_BOOT_ENABLED is turned on in the kernel configuration and no userspace application triggers the watchdog. Make sure that no userspace application is triggering the watchdog. The board will not reboot (the watchdog keepalive is handled by the watchdog core infrastructure, not by the driver itself).

Kind regards, Maxim.

Guenter Roeck (6):
  watchdog: Introduce hardware maximum heartbeat in watchdog core
  watchdog: Introduce WDOG_HW_RUNNING flag
  watchdog: Make stop function optional
  watchdog: imx2: Convert to use infrastructure triggered keepalives
  watchdog: core: Fix circular locking dependency
  watchdog: core: Clear WDOG_HW_RUNNING before calling the stop function

Pratyush Anand (1):
  watchdog: skip min and max timeout validity check when max_hw_heartbeat_ms is defined

Rasmus Villemoes (1):
  watchdog: change watchdog_need_worker logic

Sebastian Reichel (1):
  watchdog: core: add option to avoid early handling of watchdog

Wei Yongjun (1):
  watchdog: core: Fix error handling of watchdog_dev_init()

 Documentation/watchdog/watchdog-kernel-api.txt |  51 +++++--
 drivers/watchdog/Kconfig                       |  11 ++
 drivers/watchdog/imx2_wdt.c                    |  74 ++--------
 drivers/watchdog/watchdog_core.c               |   4 +-
 drivers/watchdog/watchdog_dev.c                | 196 +++++++++++++++++++++++--
 include/linux/watchdog.h                       |  40 ++++-
 6 files changed, 277 insertions(+), 99 deletions(-)

--
2.11.0
|
|
Re: Status of cip-core
Binh Thanh. Nguyen <binh.nguyen.uw@...>
Hi Chris, Daniel,
Subject: RE: [cip-dev] Status of cip-core

v4.4.55-cip3 was fully tested, including UT, IT and ST. v4.4.69-cip4 and v4.4.75-cip6 also worked with just some simple tests.

Best regards, Binh Nguyen
|
|
Re: Kselftest use-cases - Shuah Khan
Daniel Sangorrin <daniel.sangorrin@...>
Hi Ben,
# I added the fuego mailing list to Cc Thanks for the notes!
-----Original Message-----
It also works on Fuego now! # Thanks Tim for integrating my patches.

> How to run tests:
In Fuego we use a different approach. First we cross-compile and install the tests in a temporary folder. At this stage a script called "run_kselftest.sh" is generated. Then we deploy the binaries and the script to the target, where the "run_kselftest.sh" script is invoked. The good part of this approach is that Fuego does not require the target board to have a toolchain installed, the kernel source on the target, etc.

> Set `O=dir` for an out-of-tree build. (But currently this
Actually I think this was proposed by Tim. There is a TAP plugin for Jenkins that can be used for parsing the results in Fuego. However, currently "run_kselftest.sh" doesn't seem to use the TAP format, hmm. Maybe this is on the TODO list upstream; I need to investigate it further.

> The output of individual tests can be found in `/tmp` (currently),
In Fuego there is now full control for specifying which test cases are allowed to fail and which are not. I will enable that functionality in Fuego's integration scripts.

> Some tests apparently check for dependencies in a kernel config file.
In some kselftest folders there is a "config" file that specifies the kernel configuration options that need to be enabled (or disabled). From what I can see there is no general script to check that they are configured in the target kernel. Fuego does support checking the kernel configuration before running the test (using information from /proc/config.gz or /boot/config-`uname -r`). Maybe it would be a good idea to add support in Fuego for checking the individual kselftests' config files.

Thanks, Daniel

> Tips and hints:
|
|
Re: Status of cip-core
Chris Paterson
Hello Daniel,
Apologies for the slow response. Lots of ramblings below; happy to set up a call if it's easier to go through the details.

> From: Daniel Sangorrin [mailto:daniel.sangorrin@...]
Let me explain... The renesas-rz github website is used to host the complete Yocto BSP package for the RZ/G iWave platforms. This is used as a basis for the "Renesas RZ/G Linux Platform" which was launched in October. This BSP is based on the CIP Kernel, but as you can see has a lot of patches applied on top. The main reason for these additional patches is that the iwg20m platform is not yet fully supported upstream. If using the above BSP for your testing, I'd recommend that you stick with the v4.4.55-cip3 branch. This has been fully verified and should be working.

Whilst Renesas are continuously working on the renesas-rz github repository, in tandem we are also upstreaming support to the mainline Kernel for the iwg20m platform. Periodically, once a new Kernel release has been made, we backport support to the official CIP Kernel. The latest CIP Kernel (v4.4.92-cip11) includes support for clk, pinctrl, GPIO, Ethernet and eMMC on the iwg20m platform. Currently the official CIP Kernel does not support SD on the iwg20m. Support upstream will be included in Kernel v4.15. We plan to backport support once v4.15 is released, but can backport when v4.15-rc1 is out if there is an urgent need.

Yes. Historically shmobile_defconfig doesn't have ext3/4 enabled upstream :( I guess one option would be to enable this on the CIP Kernel?

> I decided to use renesas-cip's
There may well be other differences between the config used in the renesas-rz BSP compared to the upstream/CIP version. It might be best to stick with the version in the CIP Kernel and just enable EXT3/4 as required. This is what we use to test when backporting to the CIP Kernel.

Again, due to lack of SD support in the official Kernel (see r8a7743-iwg20d-q7.dts).
> # By the way, I noticed that the mmc numbering also changed (mmcblk0p2
As you have no need for the entire BSP (you only need the Kernel), I'd recommend using the official CIP Kernel. For now this would mean that you'd need to use either NFS or eMMC for your RFS. If using eMMC you'll need to enable EXT3/4 on top of shmobile_defconfig. This has the added benefit that CIP Core will use and test the official CIP Kernel, rather than Renesas' out-of-tree BSP version.

> I noticed that the new versions have been merged. I was using renesas-cip's
v4.4.55-cip3 is the only version properly tested. I think the other versions are just rebases. Binh-san, could you confirm the current status of the newer branches on renesas-rz?

Kind regards, Chris
|
|
Re: Interesting talks at OSSE/Kernel Summit
Chris Paterson
> From: cip-dev-bounces@... [mailto:cip-dev-
+1
|
|
Re: Interesting talks at OSSE/Kernel Summit
Jan Kiszka
On 2017-11-07 18:40, Ben Hutchings wrote:
> I attended several talks at OSSE and Kernel Summit in Prague that might
Thanks for the valuable summaries, Ben!
|
|
Re: B@D run on Renesas board - issue
Binh Thanh. Nguyen <binh.nguyen.uw@...>
Hello Daniel, Robert,
Subject: RE: B@D run on Renesas board - issue

Thank you for your feedback. We are now purchasing the PDU. Best regards, Binh Nguyen
|
|
Kselftest use-cases - Shuah Khan
Ben Hutchings <ben.hutchings@...>
## Kselftest use-cases - Shuah Khan
[Description](https://osseu17.sched.com/event/CnFp/)

kselftest is the kernel developer regression test suite. It is written by kernel developers and users. It is used by developers, users, and automated test infrastructure (kernel-ci.org, 0-day test robot).

How to run tests:

* `make --silent kselftest` - run all default targets (`TARGETS` in `tools/testing/selftests/Makefile`).
* `make --silent TARGETS=timers kselftest` - run all non-destructive tests in `tools/testing/selftests/timers`
* `make --silent -C tools/testing/selftests/timers run_tests` - same

Set `O=dir` for an out-of-tree build. (But currently this may require a `.config` file in the source directory.) Set `quicktest=1` to exclude time-consuming tests.

kselftest outputs a summary of results (since 4.14) following TAP (Test Anything Protocol). The output of individual tests can be found in `/tmp` (currently), but it should be changed to allow specifying a directory.

It is possible to run the latest selftests on older kernels, but there will be some failures due to missing features. Ideally missing dependencies result in a "skip" result, but some maintainers aren't happy to support that. One reason is that if a feature is broken so badly it isn't detected, tests may be skipped rather than failed. Some tests apparently check for dependencies in a kernel config file. (It wasn't clear to me where they look for it.)

Tips and hints:

* Use the `--silent` option to suppress make output
* Some tests need to run as root
* Beware that some tests are disruptive

More information:

* [Documentation/dev-tools/kselftest.rst](https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html)
* [Blog entries](https://blogs.s-osg.org/author/shuahkh/)

-- Ben Hutchings Software Developer, Codethink Ltd.
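The TAP format mentioned above is simple enough that a results consumer can be sketched in a few lines. The following is an illustrative Python sketch (not any existing kselftest or Fuego tool) that only understands plain `ok` / `not ok` result lines, and the test names in the sample are made up:

```python
import re

# A TAP result line: "ok N - description" or "not ok N - description".
TAP_LINE = re.compile(r"^(ok|not ok)\s+(\d+)\s*-?\s*(.*)$")

def parse_tap(lines):
    """Collect (passed, number, description) tuples from TAP output,
    ignoring plans ("1..N"), comments and any other lines."""
    results = []
    for line in lines:
        m = TAP_LINE.match(line.strip())
        if m:
            results.append((m.group(1) == "ok", int(m.group(2)), m.group(3)))
    return results

sample = """1..3
ok 1 - posix_timers
not ok 2 - raw_skew
ok 3 - set-timer-lat # SKIP needs root
""".splitlines()

for passed, num, desc in parse_tap(sample):
    print(num, "PASS" if passed else "FAIL", desc)
```

A real consumer would also honor the plan line and `# SKIP` / `# TODO` directives, but the pass/fail core is just this line match.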
|
|
Improve Regression Tracking - Thorsten Leemhuis
Ben Hutchings <ben.hutchings@...>
## Improve Regression Tracking - Thorsten Leemhuis
[Description](https://osseu17.sched.com/event/CnFn/)

Thorsten has started tracking regressions in the kernel, in his spare time. He is not a programmer but a journalist (for c't) and community manager. He started doing this because Linus said he would like to see someone do the job which Rafael Wysocki used to. He expected it to be hard and frustrating, and it's even worse than that!

He needs to search through LKML, subsystem MLs, multiple Bugzillas. It's very time consuming and he's still missing a lot of regressions. Discussion of a single issue might change forum, and it's not obvious, so he doesn't see that. An issue might get quietly resolved by commit without any message to a bug tracker.

He requested people to use a regression ID (which he assigns) and put it in discussions and commit messages. This hasn't happened (yet). Someone suggested to make wider use of `[REGRESSION]` for reports of recent regressions. (Both of the above should be added to in-tree documentation.) Someone suggested to add another mailing list specifically for regression reports, that may be cc'd(?) along with the specific ML.

The upstream bug reporting process differs a lot across subsystems - frustrating for distribution maintainers forwarding reports. It is also hard to see the current regression status of the next stable release when considering whether to update the kernel package.

Regression tracking is also needed for the "long tail" of bugs that don't get reported so quickly (within 1 or 2 release cycles). This will require a team of people, not just one. There needs to be some kind of database to collect information, if only references to discussions elsewhere. Rafael used to create tracking bugs on <https://bugzilla.kernel.org>. Thorsten is using a spreadsheet.

-- Ben Hutchings Software Developer, Codethink Ltd.
|
|
Detecting Performance Regressions in the Linux Kernel - Jan Kara
Ben Hutchings <ben.hutchings@...>
## Detecting Performance Regressions in the Linux Kernel - Jan Kara
[Description](https://osseu17.sched.com/event/BxIY/)

SUSE runs performance tests on a "grid" of different machines (10 x86, 1 ARM). The x86 machines have a wide range of CPUs, memory size, storage performance. There are two back-to-back connected pairs for network tests. Other instances of the same models are available for debugging.

### Software used

"Marvin" is their framework for deploying, scheduling tests, bisecting. "MMTests" is a framework for benchmarks - it parses results and generates comparisons - <https://github.com/gormanm/mmtests>.

CPU benchmarks: hackbench, libmicro, kernel page alloc benchmark (with special module), PFT, SPECcpu2016, and others. IO benchmarks: Iozone, Bonnie, Postmark, Reaim, Dbench4. These are run for all supported filesystems (ext3, ext4, xfs, btrfs) and different RAID and non-RAID configurations. Network benchmarks: sockperf, netperf, netpipe, siege. These are run over loopback and 10 gigabit Ethernet using Unix domain sockets (where applicable), TCP, and UDP. siege doesn't scale well so will be replaced. Complex benchmarks: kernbench, SPECjvm, pgbench, sqlite insertion, Postgres & MariaDB OLTP, ...

### How to detect performance changes?

Comparing a single benchmark result from each version is no good - there is often significant variance in results. It is necessary to take multiple measurements and calculate the average and standard deviation. Caches and other features for increasing performance involve prediction, which creates strong statistical dependencies. Some statistical tests assume samples come from a normal distribution, but performance results often don't. It is sometimes possible to use Welch's t-test for significance of a difference, but it is often necessary to plot a graph to understand how the performance distribution is different - it can be due to small numbers of outliers. Some benchmarks take multiple (but not enough) results and average them internally.
Ideally a benchmark framework will get all the results and do its own statistical analysis. For this reason, MMTests uses modified versions of some benchmarks.

### Reducing variance in benchmarks

Filesystems: create from scratch each time. Scheduling: bind tasks to specific NUMA nodes; disable background services; reboot before starting. It's generally not possible to control memory layout (which affects cache performance) or interrupt timing.

### Benchmarks are buggy

* Setup can take most of the time
* Averages are not always calculated correctly
* Output is sometimes not flushed at exit, causing it to be truncated

-- Ben Hutchings Software Developer, Codethink Ltd.
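The Welch's t-test mentioned in these notes can be made concrete with a small sketch. This is plain Python, not the MMTests implementation, and the benchmark numbers are fictitious:

```python
from math import sqrt
from statistics import mean, variance

def welch_t_test(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)   # sample variances (n-1 divisor)
    se2 = v1 / n1 + v2 / n2             # squared std. error of the difference
    t = (mean(a) - mean(b)) / sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Two fictitious benchmark runs (seconds per iteration, lower is better):
baseline = [10.1, 10.3, 10.2, 10.4, 10.2]
patched  = [10.8, 10.9, 11.1, 10.7, 11.0]
t, df = welch_t_test(baseline, patched)
print(f"t = {t:.2f}, df = {df:.1f}")  # large |t| suggests a real change
```

The t statistic is then compared against the t distribution with `df` degrees of freedom to get a p-value; as the notes point out, this is only valid when the samples look roughly normal, so plotting the distributions remains necessary.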
|
|
Automating Open Source License Compliance - Kate Stewart
Ben Hutchings <ben.hutchings@...>
## Automating Open Source License Compliance - Kate Stewart
[Description](https://osseu17.sched.com/event/BxI3/)

A product distribution may need to include copyright statements, licence statements, disclaimers. Why is this a problem? Open source projects are changing quickly, bringing in more copyrights and licenses. Sometimes a project's license doesn't actually apply to every source file. Different sponsoring organisations and distributions have different policies for which licenses they accept and how they're recorded. Language-specific package repositories also have their own standards for recording licenses.

Wish lists for automation:

* Developers want a simple, concise way to mark source files with the license, checked at build time
* Open Source Offices want an accurate license summary for third party software, and their own

She referred to "Trusted Upstream Processes" but didn't expand on that.

SPDX (Software Package Data eXchange) provides standard ways to identify licenses, to tag source files and to summarise this information in a manifest. Common OSS licenses are assigned short names; custom licenses are also supported. SPDX license identifiers are supported by Debian (DEP-5), with other distributions and organisations considering adopting them. U-Boot uses them, Linux is starting to use them, Eclipse Package Manager(?) will start soon. The "Open Government Partnership" created a best practices template that includes use of SPDX license identifiers.

An SPDX v2.1 document has metadata for itself, each package covered, each source file included in packages, and any "snippets" (parts of a source file) with a different license.

Various tooling is needed and available. Open source tools include FOSSology, ScanCode, SPDXTools; more are listed at <https://wiki.debian.org/CopyrightReviewTools>. Proprietary tools are available from Wind River, Black Duck, and others.

How accurate are the scanning tools? All are based partly on heuristics. She recommends testing them against a known set of source files.
She mentioned some other types of tools, but I wasn't clear on what they do. The OpenChain project documents the process of license compliance(?), which is useful for a supply chain.

Missing pieces:

* Trusted SPDX importers (for reviewed documents?)
* CI/CD build tool integration (check for tags at build time?)
* Curated real world examples
* End-to-end case studies using SPDX documents

-- Ben Hutchings Software Developer, Codethink Ltd.
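To illustrate the kind of build-time check developers are asking for, here is a minimal scanner for SPDX license identifier tags. This is only a sketch; real tools such as ScanCode or FOSSology do far more (full-text license matching, snippet detection, manifest generation):

```python
import re
from pathlib import Path

# An SPDX tag is a single comment line near the top of a file, e.g.:
#   // SPDX-License-Identifier: GPL-2.0
#   # SPDX-License-Identifier: MIT OR Apache-2.0
SPDX_TAG = re.compile(
    r"SPDX-License-Identifier:\s*"
    r"([\w.\-+]+(?:\s+(?:OR|AND|WITH)\s+[\w.\-+]+)*)"
)

def find_spdx(path):
    """Return the SPDX license expression from the first lines of a
    file, or None if the file carries no tag."""
    try:
        head = Path(path).read_text(errors="replace").splitlines()[:10]
    except OSError:
        return None
    for line in head:
        m = SPDX_TAG.search(line)
        if m:
            return m.group(1)
    return None
```

A build-time check would then just walk the source tree, call `find_spdx()` on each file, and fail the build for files that return None or an expression outside the project's accepted list.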
|
|
Interesting talks at OSSE/Kernel Summit
Ben Hutchings <ben.hutchings@...>
I attended several talks at OSSE and Kernel Summit in Prague that might
be interesting to CIP members. These weren't recorded but I made some notes on them. I'll send my notes on each talk as a reply to this message. Ben. -- Ben Hutchings Software Developer, Codethink Ltd.
|
|
Re: Next CIP kernel: CIP - Debian alignment
Ben Hutchings <ben.hutchings@...>
On Tue, 2017-11-07 at 17:41 +0100, Agustin Benito Bethencourt wrote:
> Hi Ben,
I take it that you mean the kernel for the next Debian stable release (10/buster). Debian is aiming to release roughly every 2 years. The last freeze was delayed at my request, to allow inclusion of the LTS kernel selected at the end of 2016, which was 4.9. So I would expect that the next Debian release will include the LTS kernel selected at the end of 2018.

> - Based on that answer, which ones are the best candidates?
The kernel release cycle is 9-10 weeks, so I would expect the 2018 LTS kernel to be 4.19 or 4.20/5.0. (I think I heard Linus say 4.19 will be followed by 5.0, the same way 3.19 was followed by 4.0.)

> - Do you expect LTS and Debian selection to be aligned?
Absolutely.

> - If not, is this something you would like to see happening?
I don't think CIP needs to do anything to encourage this.

Ben.

-- Ben Hutchings Software Developer, Codethink Ltd.
Re: [PATCH 0/2] Remove firmware files from CIP kernel
Jan Kiszka
On 2017-11-07 16:28, Jan Kiszka wrote:
> On 2017-11-07 16:16, Ben Hutchings wrote:
>> On Mon, 2017-11-06 at 15:33 +0100, Jan Kiszka wrote:
>>> This backports the removal of in-kernel firmware files that upstream
>>> [...]
>>
>> Right, this seems like a reasonable change.
>
> Yeah, I was afraid of that to happen (and had to use --no-validate to
> get the patch out). [...] I'll cross-check, but the conflict I had to
> resolve was only in the top-level Makefile, and that was trivial.

It's identical.

Jan
Y2038 meeting at ELCE
Ben Hutchings <ben.hutchings@...>
There was a meeting at ELCE about the status and development of Y2038-safe APIs in Linux and GNU libc. This included several developers from members of CIP, plus Arnd Bergmann, Mark Brown and Albert Aribaud. Below are my notes from the meeting.

Ben.

## Kernel

Arnd Bergmann started working on Y2038 about 6 years ago, initially looking at file-systems and VFS. File-systems are mostly ready, but VFS hasn't been switched over yet.

The largest missing piece is the syscall interface. "We know what we want to do." There was a complete patch set for an older version, but it has not been completely rebased onto current mainline. On 32-bit systems about 35 syscalls use 32-bit time, but half of those are already obsolete. Something like 22 new syscalls will be needed.

Network, sound, media, key management and input have not been dealt with yet. Patches are available for some of these, but they can be invasive and hard to review. This is a low priority for some subsystem maintainers.

About 100 device drivers need changes, ranging from an obvious 1-line change to a week's work. About 10% of the changes are needed for Y2038 safety on both 32-bit and 64-bit architectures, the rest only for 32-bit.

Arnd wants to include a kconfig option to disable the 32-bit time APIs, so that any remaining users are easy to detect.

## GNU libc

Albert Aribaud talked about the status of glibc. It will need to support both 32-bit `time_t` and 64-bit `time_t` independently of the kernel. A draft specification for this exists at <https://sourceware.org/glibc/wiki/Y2038ProofnessDesign>.

About 60 APIs are affected, using `time_t` or a derived type. Ideally, source can be rebuilt to use 64-bit `time_t` just by defining the feature macro that enables it. The implementation is not complete, partly because the syscalls haven't yet been defined.

## Other C libraries

Arnd says some other C libraries will support 64-bit `time_t`, but as a build-time option, i.e. libc and all applications must be built for either 32-bit or 64-bit `time_t`.
## Application compatibility issues

If Unix timestamps are used in binary file formats or network protocols, these will need a new version. In some cases switching to unsigned 32-bit values is easy and will work for long enough.

If `time_t` is used in library APIs then an ABI change is required. cppcheck(?) can find instances of this. Some libraries may use their own time types, so changing `time_t` won't be an ABI change, but they will need to be updated anyway.

Printing a value of type `time_t` with `printf()` and similar functions requires casting, as there's no format specifier for it. It will be necessary to cast to `long long`, whereas previously `long` would work. The sparse static checker is supposed to be able to check for truncating conversions of `time_t`.

## Ongoing work in kernel and glibc

A few people are working part-time on this. Kernel patches are 60% done after 5 years, GNU libc about 75% (but only some of those changes have been applied). More people may be needed to speed this up and get it finished.

The kernel side is coordinated through the y2038 mailing list: <https://lists.linaro.org/mailman/listinfo/y2038>. Patches are all sent to this mailing list. There is currently no git tree collecting them all. Help is wanted to:

* Update device drivers
* Review sound patches
* Collect patches into a single git tree

The glibc side is coordinated through the general development mailing list: <https://www.gnu.org/software/libc/involved.html>, <https://sourceware.org/ml/libc-alpha/>.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.
Next CIP kernel: CIP - Debian alignment
Agustin Benito Bethencourt <agustin.benito@...>
Hi Ben,
one of the recurrent questions at ELCE, which was also raised at the TSC meeting, is when we will pick the next kernel. The following variables apply to this decision:

* Some time ago, you made clear that, in order to keep the maintenance effort affordable, we should not pick kernels in a tight cadence. The TSC agreed to follow that advice.
* Since we have agreed to base CIP-Core on Debian sources, and during ELCE 2017 we picked Debian as the default distro for CIP, it seems a natural choice to pick LTS kernels that are also selected by Debian as the preferred choice.

Based on your understanding of the Debian project:

- When do you expect the next Debian kernel to be selected?
- Based on that answer, which ones are the best candidates?
- Do you expect LTS and Debian selection to be aligned?
- If not, is this something you would like to see happening?
- If yes, how can CIP help to make it happen?

Best Regards

-- 
Agustin Benito Bethencourt
Principal Consultant - FOSS at Codethink
agustin.benito@...
Re: [PATCH 0/2] Remove firmware files from CIP kernel
Jan Kiszka
On 2017-11-07 16:16, Ben Hutchings wrote:
> On Mon, 2017-11-06 at 15:33 +0100, Jan Kiszka wrote:
>> This backports the removal of in-kernel firmware files that upstream
>> [...]
>
> Right, this seems like a reasonable change.
> [...]

Yeah, I was afraid of that to happen (and had to use --no-validate to get the patch out). Good that such a file is gone now.

I'll cross-check, but the conflict I had to resolve was only in the top-level Makefile, and that was trivial.

Thanks,
Jan
Re: [PATCH 0/2] Remove firmware files from CIP kernel
Ben Hutchings <ben.hutchings@...>
On Mon, 2017-11-06 at 15:33 +0100, Jan Kiszka wrote:
> This backports the removal of in-kernel firmware files that upstream
> [...]

Right, this seems like a reasonable change. I couldn't apply your patch 1, apparently because some of the deleted ihex files have lines longer than the SMTP line limit, but I believe I've ended up with an equivalent commit.

Ben.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.
Re: B@D network setup
Robert Marshall <robert.marshall@...>
Daniel Wagner <daniel.wagner@...> writes:
> On 11/07/2017 01:09 PM, Robert Marshall wrote:
>> From my experience with network issues at ELCE I'd be inclined to just
>> [...]
>
> When I tried to build my own VM from scratch I failed at the config part
> [...]
>
> Robert, what is your take on this approach?

I think that first step sounds reasonable - all the network config needed (IIRC) is to comment out the

    config.vm.network "public_network", use_dhcp_assigned_default_route: true

line. The VM should then build without problems; it only needs that line for running a test on the BeagleBone Black.

Robert
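For reference, the change being described would look roughly like this in the B@D Vagrantfile. This is a minimal sketch in the Vagrantfile's native Ruby DSL: the box name is a placeholder, and only the commented-out network line comes from the discussion above.

```ruby
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
  # Placeholder box name; B@D's actual base box may differ.
  config.vm.box = "debian/stretch64"

  # Commented out per the advice above: the bridged interface is only
  # needed when running tests against a BeagleBone Black board, and it
  # is what fails on restrictive conference or office networks.
  # config.vm.network "public_network",
  #   use_dhcp_assigned_default_route: true
end
```

With the line commented out, Vagrant falls back to its default NAT interface, which is enough for building and using the VM locally.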