Differences

This shows you the differences between two versions of the page.

--- linux:general:troubleshooting [2018/08/28 17:23] – lunetikk
+++ linux:general:troubleshooting [2020/12/03 15:12] (current) – [Linux starts in emergency mode - faulty logical volume (xfs)] lunetikk
@@ Line 1: / Line 1: @@
 ===== Troubleshooting =====
+==== Removing old kernels leads to broken symlinks ====
+=== Description ===
+Running "apt-get autoremove" to remove old kernels leaves broken symlinks behind, and the boot loader (grub) configuration has to be regenerated afterwards
+<code>
+apt-get autoremove
+...
+The link /vmlinuz.old is a damaged link
+Removing symbolic link vmlinuz.old
+ you may need to re-run your boot loader[grub]
+The link /initrd.img.old is a damaged link
+Removing symbolic link initrd.img.old
+ you may need to re-run your boot loader[grub]
+</code>
+=== Reason ===
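+The removed kernels leave /vmlinuz.old and /initrd.img.old as broken symlinks that point at files which no longer exist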
+=== Fix ===
+Run "update-grub"
+<code>
+update-grub
+ Generating grub configuration file ...
+ Found linux image: /boot/vmlinuz-3.13.0-157-generic
+ Found initrd image: /boot/initrd.img-3.13.0-157-generic
+ Found linux image: /boot/vmlinuz-3.13.0-153-generic
+ Found initrd image: /boot/initrd.img-3.13.0-153-generic
+ Found memtest86+ image: /boot/memtest86+.elf
+ Found memtest86+ image: /boot/memtest86+.bin
+ done
+</code>
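To confirm that no damaged links are left after the cleanup, the check can be sketched as a small shell function. This is a minimal sketch, not part of the original fix: the function name list_dangling_links is hypothetical, and it only inspects entries directly inside the given directory (default /).

```shell
#!/bin/sh
# Sketch: print symbolic links directly inside a directory whose target is missing.
# list_dangling_links is a hypothetical helper name, not a standard tool.
list_dangling_links() {
    dir="${1:-/}"
    for entry in "${dir%/}"/*; do
        # -L: entry is a symlink; ! -e: the link target does not exist
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}

# Usage: list_dangling_links /
```

If the function prints nothing for /, the vmlinuz.old and initrd.img.old links have either been removed or point at existing files again.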
+\\
+\\
 ==== Linux starts in emergency mode - faulty logical volume (xfs) ====
@@ Line 55: / Line 94: @@
 Finally restart your system and pray...
-Have a look at this website for more xfs_repair related info\\
-[[http://fibrevillage.com/storage/666-how-to-repair-a-xfs-filesystem|fibrevillage.com - How to repair a xfs filesystem]]
 \\
 \\
@@ Line 91: / Line 128: @@
 mysqld would have been started with the following arguments:
---user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306 --basedir=/usr --datadir=/var/lib/mysql --tmpdir=/tmp --lc-messages-dir=/usr/share/mysql --skip-external-locking --bind-address=127.0.0.1 --key_buffer=16M --max_allowed_packet=16M --thread_stack=192K --thread_cache_size=8 --myisam-recover=BACKUP --query_cache_limit=1M --query_cache_size=16M --log_error=/var/log/mysql/error.log --expire_logs_days=10 --max_binlog_size=100M
+--user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306...
 </code>
-Compare the defaults for "socket", both should be the same. If you get a different socket for your client, try to connect to your database by using the same
+Compare the defaults for "socket"; both should be the same. If your client reports a different socket, try connecting to the database with the socket specified for mysqld
 <code>mysql --socket=/var/run/mysqld/mysqld.sock -hlocalhost -uroot -p</code>
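The socket comparison can also be scripted. The following is a rough sketch under a few assumptions: the helper names extract_socket and compare_sockets are made up for illustration, the argument lists are in the "--option=value" form shown in the output above, and socket paths contain no spaces.

```shell
#!/bin/sh
# Sketch: extract and compare the --socket option from two argument lists,
# e.g. the output of "mysql --print-defaults" and "mysqld --print-defaults".
# extract_socket and compare_sockets are hypothetical helper names.

# Print the value of the last --socket=... option found in the given text.
extract_socket() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^--socket=//p' | tail -n 1
}

# Report whether client and server arguments name the same socket.
compare_sockets() {
    client_socket=$(extract_socket "$1")
    server_socket=$(extract_socket "$2")
    if [ "$client_socket" = "$server_socket" ]; then
        printf 'match: %s\n' "$client_socket"
    else
        printf 'mismatch: client=%s server=%s\n' "$client_socket" "$server_socket"
        return 1
    fi
}

# Usage on a real system (requires the mysql client and mysqld binaries):
# compare_sockets "$(mysql --print-defaults)" "$(mysqld --print-defaults)"
```

A "mismatch" line means the client looks for a different socket file than the one the server uses, which is the situation the explicit --socket option works around.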
@@ Line 125: / Line 162: @@
 The connection should work now.
 \\
+\\
+==== Bug: soft lockup in messages ====
+=== Description ===
+Multiple "BUG: soft lockup" entries appear in /var/log/messages or in the journalctl output
+<code>
+May 25 07:23:59 XXXXXXX kernel: [13445315.881356] BUG: soft lockup - CPU#16 stuck for 23s! [yyyyyyy:81602]
+</code>
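To see which CPUs are affected and how often, the log entries can be counted per CPU. This is a minimal sketch that assumes the message format shown above; the function name count_soft_lockups is hypothetical.

```shell
#!/bin/sh
# Sketch: count "BUG: soft lockup" entries per CPU from log text on stdin.
# count_soft_lockups is a hypothetical helper name.
count_soft_lockups() {
    awk '/BUG: soft lockup - CPU#/ {
        # Pull out the "CPU#<n>" token and count one hit for it
        match($0, /CPU#[0-9]+/)
        count[substr($0, RSTART, RLENGTH)]++
    }
    END { for (cpu in count) print cpu, count[cpu] }' | LC_ALL=C sort
}

# Usage:
# count_soft_lockups < /var/log/messages
# journalctl -k | count_soft_lockups
```

Each output line has the form "CPU#16 2", meaning two soft lockup entries were logged for CPU 16.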
+=== Reason ===
+>A 'soft lockup' is defined as a bug that causes the kernel to loop in kernel mode for more than 20 seconds without giving other tasks a chance to run. The watchdog daemon will send an non-maskable interrupt (NMI) to all CPUs in the system who, in turn, print the stack traces of their currently running tasks.
+-SUSE KB [[https://www.suse.com/support/kb/doc/?id=7017652|7017652]]
+=== Fix ===
+__Solution 1:__
+Restart your system and/or decrease your CPU load.
+__Solution 2:__
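+Increase watchdog_thresh (default 10 seconds). The kernel reports a soft lockup after twice this value, so the default corresponds to the 20 seconds mentioned above.
+<code bash>echo 20 > /proc/sys/kernel/watchdog_thresh</code>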
+or
+<code bash>
+echo "kernel.watchdog_thresh=20" > /etc/sysctl.d/99-watchdog_thresh.conf
+sysctl -p /etc/sysctl.d/99-watchdog_thresh.conf
+</code>
+\\
+\\
+==== systemctl runs in timeout ====
+=== Description ===
+In this example, installing docker-ce with the following command fails
+<code>
+curl -sSL https://get.docker.com | sh
+# Executing docker install script, commit: f45d7c11389849ff46a6b4d94e0dd1ffebca32c1
++ sh -c apt-get update -qq >/dev/null
++ sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
++ sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | apt-key add -qq - >/dev/null
++ sh -c echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable" > /etc/apt/sources.list.d/docker.list
++ sh -c apt-get update -qq >/dev/null
++ [ -n  ]
++ sh -c apt-get install -y -qq --no-install-recommends docker-ce >/dev/null
+Broadcast message from systemd-journald@lunetikk (Wed 2019-10-23 00:22:12 CEST):
+systemd[1]: Caught <SEGV>, dumped core as pid 26368.
+Broadcast message from systemd-journald@lunetikk (Wed 2019-10-23 00:22:12 CEST):
+systemd[1]: Freezing execution.
+E: Sub-process /usr/bin/dpkg returned an error code (1)
+</code>
+Rerunning "apt-get install docker-ce" shows the following
+<code>
+apt-get install docker-ce
+Reading package lists... Done
+Building dependency tree
+Reading state information... Done
+docker-ce is already the newest version (5:19.03.4~3-0~ubuntu-xenial).
+After this operation, 0 B of additional disk space will be used.
+Do you want to continue? [Y/n]
+Setting up docker-ce (5:19.03.4~3-0~ubuntu-xenial) ...
+Failed to execute operation: Connection timed out
+Failed to execute operation: Connection timed out
+Failed to retrieve unit state: Connection timed out
+Failed to start docker.service: Connection timed out
+See system logs and 'systemctl status docker.service' for details.
+invoke-rc.d: initscript docker, action "start" failed.
+Failed to get properties: Connection timed out
+dpkg: error processing package docker-ce (--configure):
+ subprocess installed post-installation script returned error exit status 1
+Errors were encountered while processing:
+ docker-ce
+E: Sub-process /usr/bin/dpkg returned an error code (1)
+</code>
+Reconfiguring the package fails as well
+<code>
+dpkg-reconfigure docker-ce
+/usr/sbin/dpkg-reconfigure: docker-ce is broken or not fully installed
+</code>
+Listing the units for "systemctl status" (via tab completion) times out
+<code>
+systemctl status docke<TAB>
+Failed to list unit files: Connection timed out
+Failed to list units: Connection timed out
+Failed to list unit files: Connection timed out
+</code>
+=== Reason ===
+\\
+In my case, the filesystem on my disk was inconsistent. A reboot left me stuck at the busybox prompt. \\
+{{:linux:general:pasted:20191023-005512.png}}\\
+=== Fix ===
+\\
+I was able to fix the orphaned inodes by running "fsck.ext4 /dev/vda2"
+{{:linux:general:pasted:20191023-005629.png}}
+{{:linux:general:pasted:20191023-005648.png}}
+A reboot after this got me back into my system, and "systemctl" was working again.
 \\
