If your kernel version is 3.10.0-123, then you are *way* backlevel. That is the original 7.0 kernel, and the rest of your system is likely to be at the same 2.5-year-old level. You should run `yum update` and retest on 7.3.
Trevor
I upgraded the kernel to 4.11.1-1.el7.elrepo.x86_64 (#1 SMP Sun May 14 11:54:29 EDT 2017) and reran the tests. However, the result is unchanged: the test still hangs. I can provide the complete log file if required.
Something interesting happens around line 16698 of the log file ("[pid 22714] _exit(0)"). Here is the excerpt:
[pid 22703] futex(0x7f7dd0000f20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 22703] sendmsg(8, {msg_name(0)=NULL, msg_iov(3)=[{"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n", 24}, {"\0\0\22\4\0\0\0\0\0\0\2\0\0\0\0\0\3\0\0\0\0\0\4\0\0\377\377", 27}, {"\0\0\4\10\0\0\0\0\0\0\17\0\1", 13}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 64
[pid 22703] clock_gettime(CLOCK_MONOTONIC, {1155, 662752484}) = 0
[pid 22703] poll([{fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN|POLLOUT}], 3, 999) = 1 ([{fd=8, revents=POLLOUT}])
[pid 22703] clock_gettime(CLOCK_MONOTONIC, {1155, 662807734}) = 0
[pid 22703] poll([{fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}], 3, 999 <unfinished ...>
[pid 22714] clock_gettime(CLOCK_MONOTONIC, {1155, 662865341}) = 0
[pid 22714] sendmsg(8, {msg_name(0)=NULL, msg_iov(16)=[{"\0\1\22\1\4\0\0\0\1@\7", 11}, {":scheme", 7}, {"\4", 1}, {"http", 4}, {"@\7", 2}, {":method", 7}, {"\4", 1}, {"POST", 4}, {"@\5", 2}, {":path", 5}, {"\"", 1}, {"/command_server.CommandServer/Pi"..., 34}, {"@\n", 2}, {":authority", 10}, {"\v[::1]:42219@\r", 14}, {"grpc-encoding", 13}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 118
[pid 22714] sendmsg(8, {msg_name(0)=NULL, msg_iov(16)=[{"\10", 1}, {"identity", 8}, {"@\24", 2}, {"grpc-accept-encoding", 20}, {"\25", 1}, {"identity,deflate,gzip", 21}, {"@\2", 2}, {"te", 2}, {"\10", 1}, {"trailers", 8}, {"@\f", 2}, {"content-type", 12}, {"\20", 1}, {"application/grpc", 16}, {"@\n", 2}, {"user-agent", 10}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 109
[pid 22714] sendmsg(8, {msg_name(0)=NULL, msg_iov(8)=[{"%", 1}, {"grpc-c++/0.13.0 grpc-c/0.13.0 (l"..., 37}, {"@\f", 2}, {"grpc-timeout", 12}, {"\00310S\0\0\4\10\0\0\0\0\1\0\0", 15}, {"\377\377\0\0%\0\1\0\0\0\1\0\0\0\0", 15}, {" ", 1}, {"\n\03646fec02652d766b966d8d8da6f32ae", 32}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 115
[pid 22714] madvise(0x7f7dd6606000, 8368128, MADV_DONTNEED) = 0
[pid 22714] _exit(0) = ?
[pid 22703] <... poll resumed> ) = 1 ([{fd=8, revents=POLLIN}])
[pid 22714] +++ exited with 0 +++
recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\0\0\f\4\0\0\0\0\0\0\3\177\377\377\377\0\4\0\20\0\0\0\0\4\10\0\0\0\0\0\0\17"..., 8192}], msg_controllen=0, msg_flags=0}, 0) = 183
sendmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\0\0\0\4\1\0\0\0\0", 9}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 9
recvmsg(8, 0x7ffc26b44450, 0) = -1 EAGAIN (Resource temporarily unavailable)
open("/root/.cache/bazel/_bazel_root/038a9c24c67a3f14ac28680c554d9af8/server/server.pid.txt", O_RDONLY) = 9
read(9, "11702", 32) = 5
read(9, "", 27) = 0
close(9) = 0
readlink("/root/.cache/bazel/_bazel_root/038a9c24c67a3f14ac28680c554d9af8/install", "/root/.cache/bazel/_bazel_root/i"..., 4096) = 71
open("/root/.cache/bazel/_bazel_root/038a9c24c67a3f14ac28680c554d9af8/server/cmdline", O_RDONLY) = 9
read(9, "bazel(root)\0-XX:+HeapDumpOnOutOf"..., 4096) = 948
read(9, "", 4096) = 0
close(9) = 0
unlink("/root/.cache/bazel/_bazel_root/038a9c24c67a3f14ac28680c554d9af8/javalog.properties") = 0
open("/root/.cache/bazel/_bazel_root/038a9c24c67a3f14ac28680c554d9af8/javalog.properties", O_WRONLY|O_CREAT|O_TRUNC, 0755) = 9
write(9, "handlers=java.util.logging.FileH"..., 380) = 380
close(9) = 0
readlink("/proc/11702/cwd", "/root/bazel", 4096) = 11
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 32832517}) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigaction(SIGINT, {0x420c70, [INT], SA_RESTORER|SA_RESTART, 0x7f7dd8d12250}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTERM, {0x420c70, [TERM], SA_RESTORER|SA_RESTART, 0x7f7dd8d12250}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPIPE, {0x420c70, [PIPE], SA_RESTORER|SA_RESTART, 0x7f7dd8d12250}, {SIG_DFL, [], 0}, 8) = 0
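In case it helps anyone digging through the full log: a rough Python sketch for spotting what each thread did just before it exited, which is how a line like the `_exit(0)` one can be located in a long trace. It assumes the log has one syscall per line in `strace -f` style with `[pid N]` prefixes; the function name `summarize_exits` and the `context` parameter are made up for illustration.

```python
import re
from collections import defaultdict

# "[pid 22714] _exit(0) = ?" -> pid "22714", rest "_exit(0) = ?"
PID_LINE = re.compile(r"^\[pid\s+(\d+)\]\s+(.*)$")
# Syscall name, either "name(" or "<... name resumed>"
CALL = re.compile(r"^(?:<\.\.\. )?(\w+)(?:\(| resumed>)")


def summarize_exits(lines, context=3):
    """Map each exited pid to the names of its last `context` syscalls."""
    recent = defaultdict(list)  # pid -> syscall names seen so far
    exited = {}
    for line in lines:
        m = PID_LINE.match(line.strip())
        if not m:
            continue  # lines without a [pid N] prefix are ignored
        pid, rest = int(m.group(1)), m.group(2)
        if rest.startswith("+++ exited"):
            exited[pid] = recent[pid][-context:]
            continue
        call = CALL.match(rest)
        if call:
            recent[pid].append(call.group(1))
    return exited


sample = [
    "[pid 22714] sendmsg(8, {...}, MSG_NOSIGNAL) = 115",
    "[pid 22714] madvise(0x7f7dd6606000, 8368128, MADV_DONTNEED) = 0",
    "[pid 22714] _exit(0) = ?",
    "[pid 22703] <... poll resumed> ) = 1 ([{fd=8, revents=POLLIN}])",
    "[pid 22714] +++ exited with 0 +++",
]
result = summarize_exits(sample)
print(result)
```

On the real log, passing `open(path)` instead of `sample` should work, since the function only needs an iterable of lines.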
Thanks, Atul.
On Wed, May 17, 2017 at 4:59 PM, Atul Sowani <sowani@gmail.com> wrote:
Hi,
I have observed that certain Bazel test cases from the test suite time out on CentOS 7 (kernel version 3.10.0-123.el7.x86_64). For example, I ran bazel_coverage_test (using the command `bazel test //src/test/shell/bazel:bazel_coverage_test`) and it simply hangs. I traced it with strace (log attached).
This seems to be CentOS-specific behavior, as I did not observe it on Ubuntu 16.04.
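For anyone who wants to reproduce this, here is a minimal Python sketch of one way to capture such a trace without the shell session blocking forever. The exact strace invocation used for the attached log is not stated, so the flags below (`-f` to follow threads and children, `-o` to write to a file) are an assumption, and the helper names `build_strace_command` and `run_traced` are hypothetical.

```python
import subprocess


def build_strace_command(target, log_path):
    """Build an strace command line wrapping `bazel test <target>`."""
    return [
        "strace",
        "-f",            # follow threads and child processes
        "-o", log_path,  # write the trace to a file instead of stderr
        "bazel", "test", target,
    ]


def run_traced(target, log_path, timeout_s):
    """Run the traced test; return its exit code, or None on timeout."""
    cmd = build_strace_command(target, log_path)
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the strace process before raising.
        return None


cmd = build_strace_command(
    "//src/test/shell/bazel:bazel_coverage_test", "trace.log"
)
print(" ".join(cmd))
```

Calling `run_traced(...)` with a timeout of a few minutes would then return `None` if the test hangs. Note that the Bazel server runs as a separate process, so it may keep running after the traced client is killed.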
Has anybody observed this? Is this a regression as far as CentOS is concerned?
Thanks, Atul.