Differences

This shows you the differences between two versions of the page.

linux:general:troubleshooting [2018/08/28 17:23] lunetikk
linux:general:troubleshooting [2020/12/03 15:12] (current) – [Linux starts in emergency mode - faulty logical volume (xfs)] lunetikk
Line 1: Line 1:
 ===== Troubleshooting =====
 +
 +==== Removing old kernels leads to broken symlinks ====
 +
 +=== Description ===
 +
 +Running apt-get autoremove to remove old kernels leaves broken symlinks behind, after which the GRUB configuration needs to be regenerated.
 +
 +<code>
 +apt-get autoremove
 +...
 +The link /vmlinuz.old is a damaged link
 +Removing symbolic link vmlinuz.old
 + you may need to re-run your boot loader[grub]
 +The link /initrd.img.old is a damaged link
 +Removing symbolic link initrd.img.old
 + you may need to re-run your boot loader[grub]
 +</code>
 +
 +=== Reason === 
 +
 +The removed kernel leaves /vmlinuz.old and /initrd.img.old pointing at files that no longer exist.
 +
 +=== Fix === 
 +
 +Run "update-grub" to regenerate the GRUB configuration:
 +<code>
 +update-grub
 + Generating grub configuration file ...
 + Found linux image: /boot/vmlinuz-3.13.0-157-generic
 + Found initrd image: /boot/initrd.img-3.13.0-157-generic
 + Found linux image: /boot/vmlinuz-3.13.0-153-generic
 + Found initrd image: /boot/initrd.img-3.13.0-153-generic
 + Found memtest86+ image: /boot/memtest86+.elf
 + Found memtest86+ image: /boot/memtest86+.bin
 + done
 +</code>
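
To check whether any other dangling symlinks are left in the root directory, something like the following sketch may help. It assumes GNU find (for `-xtype`); the function name `list_broken_links` is made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list dangling symlinks directly inside a directory.
# Defaults to / because that is where vmlinuz.old and initrd.img.old live.
list_broken_links() {
    local dir="${1:-/}"
    # -xtype l (GNU find) matches symlinks whose target does not exist
    find "$dir" -maxdepth 1 -xtype l
}
```

Calling `list_broken_links` without arguments inspects `/`. If it prints nothing, no broken links remain there.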
 +
 +\\
 +\\
  
 ==== Linux starts in emergency mode - faulty logical volume (xfs) ====
Line 55: Line 94:
 Finally restart your system and pray...
  
-Have a look at this website for more xfs_repair related info\\ 
-[[http://fibrevillage.com/storage/666-how-to-repair-a-xfs-filesystem|fibrevillage.com - How to repair a xfs filesystem]] 
 \\
 \\
Line 91: Line 128:
  
 mysqld would have been started with the following arguments:
---user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306 --basedir=/usr --datadir=/var/lib/mysql --tmpdir=/tmp --lc-messages-dir=/usr/share/mysql --skip-external-locking --bind-address=127.0.0.1 --key_buffer=16M --max_allowed_packet=16M --thread_stack=192K --thread_cache_size=8 --myisam-recover=BACKUP --query_cache_limit=1M --query_cache_size=16M --log_error=/var/log/mysql/error.log --expire_logs_days=10 --max_binlog_size=100M
+--user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306...
 </code>
  
-Compare the defaults for "socket", both should be the same. If you get a different socket for your client, try to connect to your database by using the same
+Compare the defaults for "socket"; both should be the same. If your client reports a different socket, try connecting to your database with the same socket that is specified for mysqld:
 <code>mysql --socket=/var/run/mysqld/mysqld.sock -hlocalhost -uroot -p</code>
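
The socket comparison can also be scripted. The sketch below assumes the `--print-defaults` output looks like the line above. The helper name `socket_of` is made up for this example, and taking the last `--socket=` occurrence is an assumption about which value wins.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read "--print-defaults" style output on stdin and
# print the value of the last --socket=... option it contains.
socket_of() {
    # Put every option on its own line, keep only the socket value
    tr ' ' '\n' | sed -n 's/^--socket=//p' | tail -n 1
}

# Possible usage on a machine with MySQL installed (not executed here):
#   mysqld --print-defaults | socket_of
#   mysql  --print-defaults | socket_of
```

If the two commands print different paths, the client and the server disagree about the socket.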
  
Line 125: Line 162:
 The connection should work now.
 \\
 +\\
 +
 +==== Bug: soft lockup in messages ====
 +
 +=== Description ===
 +
 +You can find multiple "BUG: soft lockup" entries in /var/log/messages or in the journalctl output:
 +
 +<code>
 +May 25 07:23:59 XXXXXXX kernel: [13445315.881356] BUG: soft lockup - CPU#16 stuck for 23s! [yyyyyyy:81602]
 +</code>
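
To see which CPUs are affected and how often, the entries can be tallied. This is a rough sketch that assumes the log lines have the format shown above. The function name `soft_lockups_per_cpu` is made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count "BUG: soft lockup" entries per CPU in a log file.
# Prints one line per CPU, e.g. "CPU#16 2".
soft_lockups_per_cpu() {
    local log="${1:-/var/log/messages}"
    grep -o 'soft lockup - CPU#[0-9]*' "$log" \
        | awk '{ print $NF }' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

On systems without /var/log/messages, the kernel log could be written to a file first (for example with `journalctl -k > kernel.log`) and that file passed as the argument.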
 +
 +=== Reason === 
 +
 +>A 'soft lockup' is defined as a bug that causes the kernel to loop in kernel mode for more than 20 seconds without giving other tasks a chance to run. The watchdog daemon will send an non-maskable interrupt (NMI) to all CPUs in the system who, in turn, print the stack traces of their currently running tasks. 
 +-SUSE KB [[https://www.suse.com/support/kb/doc/?id=7017652|7017652]]
 +
 +=== Fix === 
 +
 +__Solution 1:__
 +
 +Restart your system and/or decrease your CPU load.
 +
 +__Solution 2:__
 +
 +Increase the watchdog threshold (default: 10 seconds) so that soft lockups are reported later. The kernel reports a soft lockup after twice this value.
 +
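 +<code bash>echo 20 > /proc/sys/kernel/watchdog_thresh</code>
 +or, to keep the setting across reboots:
 +<code bash>
 +# Persist the new threshold in a sysctl drop-in file
 +echo "kernel.watchdog_thresh=20" > /etc/sysctl.d/99-watchdog_thresh.conf
 +
 +# Load the file so the value applies immediately
 +sysctl -p /etc/sysctl.d/99-watchdog_thresh.conf
 +</code>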
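
To verify the setting afterwards, the current value can be read back. This is a small sketch that relies on the kernel's rule that the soft lockup threshold is twice `watchdog_thresh`. The function name `soft_lockup_seconds` is made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the effective soft lockup threshold in seconds,
# which the kernel derives as 2 * watchdog_thresh.
soft_lockup_seconds() {
    local file="${1:-/proc/sys/kernel/watchdog_thresh}"
    local thresh
    thresh="$(cat "$file")" || return 1
    echo $(( 2 * thresh ))
}
```

With the default value of 10 this prints 20, and after setting `watchdog_thresh` to 20 it prints 40.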
 +\\
 +\\
 +
 +
 +==== systemctl runs into a timeout ====
 +
 +=== Description ===
 +
 +In this example, installing docker-ce with the following command fails:
 +<code>
 +curl -sSL https://get.docker.com | sh
 +
 +# Executing docker install script, commit: f45d7c11389849ff46a6b4d94e0dd1ffebca32c1
 ++ sh -c apt-get update -qq >/dev/null
 ++ sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
 ++ sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | apt-key add -qq - >/dev/null
 ++ sh -c echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable" > /etc/apt/sources.list.d/docker.list
 ++ sh -c apt-get update -qq >/dev/null
 ++ [ -n  ]
 ++ sh -c apt-get install -y -qq --no-install-recommends docker-ce >/dev/null
 +
 +Broadcast message from systemd-journald@lunetikk (Wed 2019-10-23 00:22:12 CEST):
 +
 +systemd[1]: Caught <SEGV>, dumped core as pid 26368.
 +
 +
 +Broadcast message from systemd-journald@lunetikk (Wed 2019-10-23 00:22:12 CEST):
 +
 +systemd[1]: Freezing execution.
 +
 +E: Sub-process /usr/bin/dpkg returned an error code (1)
 +</code>
 +
 +Rerunning "apt-get install docker-ce" shows the following:
 +<code>
 +apt-get install docker-ce
 +Reading package lists... Done
 +Building dependency tree
 +Reading state information... Done
 +docker-ce is already the newest version (5:19.03.4~3-0~ubuntu-xenial).
 +After this operation, 0 B of additional disk space will be used.
 +Do you want to continue? [Y/n]
 +Setting up docker-ce (5:19.03.4~3-0~ubuntu-xenial) ...
 +Failed to execute operation: Connection timed out
 +Failed to execute operation: Connection timed out
 +Failed to retrieve unit state: Connection timed out
 +Failed to start docker.service: Connection timed out
 +See system logs and 'systemctl status docker.service' for details.
 +invoke-rc.d: initscript docker, action "start" failed.
 +Failed to get properties: Connection timed out
 +dpkg: error processing package docker-ce (--configure):
 + subprocess installed post-installation script returned error exit status 1
 +Errors were encountered while processing:
 + docker-ce
 +E: Sub-process /usr/bin/dpkg returned an error code (1)
 +</code>
 +
 +You can't reconfigure the package either:
 +<code>
 +dpkg-reconfigure docker-ce
 +/usr/sbin/dpkg-reconfigure: docker-ce is broken or not fully installed
 +</code>
 +
 +Tab-completing the unit name for "systemctl status" also runs into a timeout:
 +<code>
 +systemctl status docke<TAB>
 +Failed to list unit files: Connection timed out
 +Failed to list units: Connection timed out
 +Failed to list unit files: Connection timed out
 +</code>
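
When systemd stops answering, every systemctl call blocks until its own timeout expires. A quick probe with a short limit can save some waiting. The sketch below assumes the coreutils `timeout` command is available. The function name `probe` is made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command with a time limit and report whether it
# finished ("ok"), exceeded the limit ("timed out"), or failed ("failed (N)").
probe() {
    local secs="$1" rc
    shift
    # timeout(1) exits with status 124 when the limit is hit
    timeout "$secs" "$@" >/dev/null 2>&1 && rc=0 || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "timed out"
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        echo "failed ($rc)"
    fi
}

# Possible usage (not executed here):
#   probe 5 systemctl list-units --no-pager
```

A "timed out" result on a simple query like this suggests that systemd itself is unresponsive and that the problem is not specific to one unit.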
 +
 +=== Reason ===  
 +\\
 +
 +In my case, the filesystem on my disk was "inconsistent". A reboot left me stuck in the BusyBox shell. \\
 +
 +{{:linux:general:pasted:20191023-005512.png}}\\
 +
 +=== Fix ===  
 +\\
 +
 +I was able to run "fsck.ext4 /dev/vda2" to fix the orphaned inodes
 +
 +{{:linux:general:pasted:20191023-005629.png}}
 +
 +{{:linux:general:pasted:20191023-005648.png}}
 +
 +A reboot after this got me back into my system, and "systemctl" was working again.
 +
 \\
  
linux/general/troubleshooting.1535469787.txt.gz · Last modified: 2018/08/28 17:23 by lunetikk