CHANGELOG revision e8bd737d
2018-04-09

    [API Change, OPTIMIZATION] Only process conns that need to be processed

    The API is simplified: do not expose the user code to several
    queues. A "connection queue" is now an internal concept.
    The user processes connections using the single function
    lsquic_engine_process_conns(). When this function is called,
    only those connections are processed that need to be processed.
    A connection needs to be processed when:

        1. New incoming packets have been fed to the connection.
        2. User wants to read from a stream that is readable.
        3. User wants to write to a stream that is writeable.
        4. There are buffered packets that can be sent out. (This
           means that the user wrote to a stream outside of the
           lsquic library callback.)
        5. A control frame (such as BLOCKED) needs to be sent out.
        6. A stream needs to be serviced or a delayed stream needs to
           be created.
        7. An alarm rings.
        8. Pacer timer expires.

    To achieve this, the library places the connections into two
    priority queues (min heaps):

        1. Tickable Queue; and
        2. Advisory Tick Time Queue (ATTQ).

    Each time lsquic_engine_process_conns() is called, the Tickable
    Queue is emptied. After the connections have been ticked, they are
    queried again: if a connection is not being closed, it is placed
    either in the Tickable Queue, if it is ready to be ticked again, or
    in the Advisory Tick Time Queue. It is assumed that a connection
    always has at least one timer set (the idle alarm).

    The connections in the Tickable Queue are arranged in
    least-recently-ticked order. This lets connections that have been
    quiet longer get their packets scheduled first.

    This change means that the library no longer needs to be ticked
    periodically. The user code can query the library for the time of
    the next tick event and schedule it exactly.
    When connections are processed, only the tickable connections are
    processed, not *all* the connections. When there are no tick events,
    no timer event is necessary -- only the file descriptor READ event
    is active.

    The following are improvements and simplifications that have
    been triggered:

        - Queue of connections with incoming packets is gone.
        - "Pending Read/Write Events" Queue is gone (along with its
          history and progress checks). This queue has become the
          Tickable Queue.
        - The connection hash no longer needs to track the connection
          insertion order.

2018-04-02

    - [FEATURE] Windows support

    - Reduce stack use -- outgoing packet batch is now allocated on the heap.

2018-03-09

    - [OPTIMIZATION] Merge series of ACKs if possible

      Parsed single-range ACK frames (that is the majority of frames) are
      saved in the connection and their processing is deferred until the
      connection is ticked. If several ACKs come in a series between
      adjacent ticks, we check whether the latest ACK is a strict superset
      of the saved ACK. If it is, the older ACK is not processed.

      If ACK frames can be merged, they are merged and only one of them is
      either processed or saved.

    - [OPTIMIZATION] Speed up ACK verification by simplifying send history.

      Never generate a gap in the sent packet number sequence. This reduces
      the send history to a single number instead of potentially a series of
      packet ranges and thereby speeds up ACK verification.

      By default, detecting a gap in the send history is not fatal: only a
      single warning is generated per connection. The connection can continue
      to operate even if the ACK verification code is not able to detect some
      inconsistencies.
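To illustrate why a gap-free send history makes ACK verification cheap, here is a minimal sketch in C. It is not the lsquic implementation: the names `send_history`, `ack_range`, `history_add` and `history_covers` are hypothetical, and the sketch assumes packet numbers start at 1. With no gaps, the whole history is one number, and checking an ACK range takes a few comparisons.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; these are not lsquic data structures. */

/* One contiguous range of acknowledged packet numbers, inclusive. */
struct ack_range {
    uint64_t low;
    uint64_t high;
};

/* Gap-free send history: packets 1..largest_sent have all been sent.
 * Assumption: packet numbers start at 1, so 0 means "nothing sent yet".
 */
struct send_history {
    uint64_t largest_sent;
};

/* Record a sent packet.  Returns false if the packet number would
 * introduce a gap, in which case the history is left unchanged.
 */
static bool
history_add (struct send_history *hist, uint64_t packno)
{
    if (packno != hist->largest_sent + 1)
        return false;
    hist->largest_sent = packno;
    return true;
}

/* An ACK range is plausible only if every packet number in it was
 * actually sent.  With a gap-free history this is a bounds check.
 */
static bool
history_covers (const struct send_history *hist,
                const struct ack_range *range)
{
    return range->low >= 1
        && range->low <= range->high
        && range->high <= hist->largest_sent;
}
```

In this sketch a caller that sees history_add() fail could log a single warning and carry on, which mirrors the non-fatal default described above.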

    - [OPTIMIZATION] Rearrange the lsquic_send_ctl struct

      The first part of struct lsquic_send_ctl now consists of members that
      are used in lsquic_send_ctl_got_ack() (in the absence of packet loss,
      which is the normal case). To speed up reads and writes, we no longer
      try to save space by using 8- and 16-bit integers. Use regular integer
      width for everything.

    - [OPTIMIZATION] Cache size of sent packet.

    - [OPTIMIZATION] Keep track of the largest ACKed in packet_out

      Instead of parsing our own ACK frames when a packet has been acked,
      use the value saved in the packet_out structure when the ACK frame
      was generated.

    - [OPTIMIZATION] Take RTT sampling conditional out of ACK loop

    - [OPTIMIZATION] ACK processing: only call clock_gettime() if needed

    - [OPTIMIZATION] Several code-level optimizations to ACK processing.

    - Fix: http_client: fix -I flag; switch assert() to abort()

2018-02-26
    - [API Change] lsquic_engine_connect() returns pointer to the connection
      object.
    - [API Change] Add lsquic_conn_get_engine() to get engine object from
      connection object.
    - [API Change] Add lsquic_conn_status() to query connection status.
    - [API Change] Add lsquic_conn_set_ctx().
    - [API Change] Add new timestamp format, e.g. 2017-03-21 13:43:46.671345
    - [OPTIMIZATION] Process handshake STREAM frames as soon as packet
      arrives.
    - [OPTIMIZATION] Do not compile expensive send controller sanity check
      by default.
    - [OPTIMIZATION] Add fast path to gquic_be_gen_reg_pkt_header.
    - [OPTIMIZATION] Only make squeeze function call if necessary.
    - [OPTIMIZATION] Speed up Q039 ACK frame parsing.
    - [OPTIMIZATION] Fit most used elements of packet_out into first 64 bytes.
    - [OPTIMIZATION] Keep track of scheduled bytes instead of calculating.
    - [OPTIMIZATION] Prefetch next unacked packet when processing ACK.
    - [OPTIMIZATION] Leverage fact that ACK ranges and unacked list are
      ordered.
    - [OPTIMIZATION] Reduce function pointer use for STREAM frame generation
    - Fix: reset incoming streams that arrive after we send GOAWAY.
    - Fix: delay client on_new_conn() call until connection is fully set up.
    - Fixes to buffered packets logic: splitting, STREAM frame elision.
    - Fix: do not dispatch on_write callback if no packets are available.
    - Fix WINDOW_UPDATE send and resend logic.
    - Fix STREAM frame extension code.
    - Fix: Drop unflushed data when stream is reset.
    - Switch to tracking CWND using bytes rather than packets.
    - Fix TCP friendly adjustment in cubic.
    - Fix: do not generate invalid STOP_WAITING frames during high packet
      loss.
    - Pacer fixes.

2017-12-18

    - Fix: better follow cubic curve after idle period
    - Fix: add missing parts to outgoing packet splitting code
    - Fix: compilation using gcc 4.8.4

2017-10-31

    - Add APIs.txt -- describes LSQUIC APIs

2017-10-31

    - [API Change] Sendfile-like functionality is gone. The stream no
      longer opens files and deals with file descriptors. (Among other
      things, this makes the code more portable.) Three writing functions
      are provided:

        lsquic_stream_write
        lsquic_stream_writev
        lsquic_stream_writef    (NEW)

      lsquic_stream_writef() is given an abstract reader that has function
      pointers for size() and read() functions which the user can implement.
      This is the most flexible way. lsquic_stream_write() and
      lsquic_stream_writev() are now both implemented as wrappers around
      lsquic_stream_writef().

    - [OPTIMIZATION] When writing to stream, be it within or without the
      on_write() callback, place data directly into packet buffer,
      bypassing auxiliary data structures.
      This reduces the amount of memory required, for the amount of data
      that can be written is limited by the congestion window.

      To support writes outside the on_write() callback, we keep N
      outgoing packet buffers per connection which can be written to
      by any stream. One half of these are reserved for the highest
      priority stream(s), the other half for all other streams. This way,
      low-priority streams cannot write instead of high-priority streams
      and, on the other hand, low-priority streams get a chance to send
      their packets out.

      The algorithm is as follows:

      - When user writes to stream outside of the callback:
        - If this is the highest priority stream, place it onto the
          reserved N/2 queue or fail.
            (The actual size of this queue is dynamic -- MAX(N/2, CWND) --
             rather than N/2, allowing high-priority streams to write as
             much as can be sent.)
        - If the stream is not the highest priority, try to place the
          data onto the other (low-priority) N/2 queue or fail.
      - When tick occurs *and* more packets can be scheduled:
        - Transfer packets from the high N/2 queue to the scheduled
          queue.
        - If more scheduling is allowed:
          - Call on_write callbacks for highest-priority streams,
            placing resulting packets directly onto the scheduled queue.
        - If more scheduling is allowed:
          - Transfer packets from the low N/2 queue to the scheduled
            queue.
        - If more scheduling is allowed:
          - Call on_write callbacks for non-highest-priority streams,
            placing resulting packets directly onto the scheduled queue.

      The number N is currently 20, but it could be varied based on
      resource usage.

    - If stream is created due to incoming headers, make headers readable
      from on_new.

    - Outgoing packets are no longer marked non-writeable to prevent placing
      more than one STREAM frame from the same stream into a single packet.
      This property is maintained via code flow and an explicit check.
      Packets for stream data are allocated using a special function.

    - STREAM frame elision is cheaper, as we only perform it if a reset
      stream has outgoing packets referencing it.

    - lsquic_packet_out_t is smaller, as stream_rec elements are now
      inside a union.

2017-10-12

    - Do not send RST_STREAM when stream is closed for reading
    - Raise maximum header size from 4K to 64K
    - Check header name and value lengths against maximum imposed by HPACK
    - Fix NULL dereference in stream flow controller

2017-10-09

    - Hide handshake implementation behind a set of function pointers
    - Use monotonically increasing clock
    - Make sure that retx delay is not larger than the max of 60 seconds

2017-09-29

    - A few fixes to code and README

2017-09-28

    - Add support for Q041; drop support for Q040

2017-09-27

    - Fix CMakeLists.txt: BoringSSL include and lib was mixed up

2017-09-26

    - Add support for Mac OS
    - Add support for Raspberry Pi
    - Fix BoringSSL compilation: include <openssl/hmac.h> explicitly

2017-09-22

    - Initial release