Overview
Turbomachinery preliminary design demands instant feedback on thermodynamic and flow distributions — velocity, pressure, temperature and density — across the meridional plane and blade surfaces. A full CFD solve is far too expensive at this stage, so a lighter method is needed.
This project implements a quasi-3D streamline-curvature solver based on the method described by Katsanis (1966) for arbitrary quasi-orthogonal (QO) lines. The algorithm was first prototyped in Python for rapid iteration, then rewritten from scratch in C++ for production integration with the host CAE platform.
The production solver achieves sub-second solve times for dozens of iterations, validated against an existing hub-to-shroud solver and CFD results. It was accepted as a beta feature and merged into the main codebase.
Research Foundation
Katsanis (1966) — Arbitrary Quasi-Orthogonals
The solver is based on the NASA paper “Use of Arbitrary Quasi-Orthogonals for Calculating Flow Distribution in a Turbomachine” by T. Katsanis (1966). It presents a streamline-curvature method that solves the velocity-gradient equation along user-defined QO lines spanning from hub to shroud, iteratively repositioning streamlines until mass-flow balance is achieved.
The original paper included Fortran IV source code and a sample test case that converged in 2–3 minutes on 1960s hardware — strong evidence that the algorithm could run in the millisecond range on modern machines. I was able to read and follow the Fortran code (barely), and several critical convergence safeguards and edge-case treatments were extracted directly from it.
Python Prototype
Rapid Algorithm Prototyping
Development began with a Python prototype that allowed fast iteration on the solver logic and immediate visual feedback via Matplotlib. The prototype built machine/segment models from hub and shroud Bézier curves, initialized QO lines and streamlines, ran the iterative solver loop, and plotted results as line charts.
A key challenge was evaluating the blade angle derivatives and tangential thickness — these depend on blade-generation methods internal to the host platform. As a workaround, data was exported from the platform for a grid of sample points, and polynomial regression was used to fit surfaces for angle and thickness.
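The fitting step can be illustrated with a minimal least-squares sketch (written in C++ for consistency with the rest of this write-up, although the prototype was Python). A bilinear basis stands in for the higher-order polynomial surfaces actually used; all names are illustrative:

```cpp
#include <array>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the prototype's regression step: least-squares
// fit of f(x, y) ≈ a + b*x + c*y + d*x*y to sampled blade data.
struct BilinearFit {
    std::array<double, 4> coef{};  // a, b, c, d

    double eval(double x, double y) const {
        return coef[0] + coef[1] * x + coef[2] * y + coef[3] * x * y;
    }
};

// samples: (x, y, f) triples exported from the host platform.
BilinearFit fitBilinear(const std::vector<std::array<double, 3>>& samples) {
    // Build the 4x4 normal equations (A^T A) w = A^T f for basis {1, x, y, xy}.
    double ata[4][4] = {}, atb[4] = {};
    for (const auto& s : samples) {
        const double basis[4] = {1.0, s[0], s[1], s[0] * s[1]};
        for (int i = 0; i < 4; ++i) {
            atb[i] += basis[i] * s[2];
            for (int j = 0; j < 4; ++j) ata[i][j] += basis[i] * basis[j];
        }
    }
    // Gauss-Jordan elimination with partial pivoting; leaves ata diagonal.
    for (int col = 0; col < 4; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 4; ++r)
            if (std::fabs(ata[r][col]) > std::fabs(ata[pivot][col])) pivot = r;
        for (int j = 0; j < 4; ++j) std::swap(ata[col][j], ata[pivot][j]);
        std::swap(atb[col], atb[pivot]);
        for (int r = 0; r < 4; ++r) {
            if (r == col) continue;
            const double f = ata[r][col] / ata[col][col];
            for (int j = 0; j < 4; ++j) ata[r][j] -= f * ata[col][j];
            atb[r] -= f * atb[col];
        }
    }
    BilinearFit fit;
    for (int i = 0; i < 4; ++i) fit.coef[i] = atb[i] / ata[i][i];
    return fit;
}
```

Once fitted, angle and thickness queries become cheap polynomial evaluations instead of round-trips to the host platform.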
While the Python solver worked, it had convergence issues and was too slow for production use. Connecting it to the C++ host would also incur significant data-transfer overhead. This motivated a full rewrite in C++.
C++ Production Implementation
Solver Engine
The C++ rewrite follows the same physical algorithm but is structured for production robustness. The solver engine manages inlet/fluid initialization, geometry ingestion from host-platform objects, QO-line and streamline setup, the main iterative solve loop (flow population, velocity marching, mass-flow balancing, streamline repositioning, convergence checks), and post-processing for blade-loading output.
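The shape of the main loop can be sketched as follows. This is a structural outline only — every member-function name here is illustrative, not the production API:

```cpp
// Skeleton of the iterative solve loop described above. The Solver type and
// all member names are hypothetical stand-ins for the proprietary engine.
struct SolveResult {
    int iterations;
    double residual;
    bool converged;
};

template <typename Solver>
SolveResult runSolveLoop(Solver& solver, int maxIter, double tol) {
    solver.initializeInletAndGeometry();  // fluid state, QO lines, streamlines
    double residual = 1e30;
    int it = 0;
    for (; it < maxIter; ++it) {
        solver.populateFlowVariables();   // thermodynamic state at grid points
        solver.marchVelocities();         // velocity gradient along each QO line
        solver.balanceMassFlow();         // hub-velocity predictor per QO line
        residual = solver.repositionStreamlines();  // max streamline movement
        if (residual < tol) { ++it; break; }
    }
    solver.postProcessBladeLoading();
    return {it, residual, residual < tol};
}
```

Keeping the loop body free of virtual dispatch and allocation is what makes the per-iteration optimizations below pay off.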
Examining the original Fortran code exposed a critical bug: the hub-velocity prediction — an iterative sub-process that finds the hub velocity giving the correct mass-flow rate at each QO line — had been treated as a single-step computation. Implementing the full iterative predictor (derived from a Fortran subroutine in the original code) resolved the convergence issues that plagued the Python prototype.
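In spirit, the predictor is a 1D root-find: adjust the hub velocity until the mass flow integrated along the QO line matches the target. The sketch below uses simple bisection in place of the Fortran routine's own iteration scheme, and assumes the mass flow is monotone in hub velocity (the subsonic branch):

```cpp
#include <cmath>
#include <functional>

// Illustrative hub-velocity predictor: find the hub velocity for which the
// mass flow integrated along a QO line hits the target. Bisection stands in
// for the original Fortran subroutine's iteration; massFlowAt is assumed
// monotonically increasing on [wLo, wHi] with the target bracketed.
double predictHubVelocity(const std::function<double(double)>& massFlowAt,
                          double targetMassFlow, double wLo, double wHi,
                          int maxIter = 60, double tol = 1e-10) {
    for (int i = 0; i < maxIter; ++i) {
        const double mid = 0.5 * (wLo + wHi);
        if (massFlowAt(mid) < targetMassFlow) wLo = mid; else wHi = mid;
        if (wHi - wLo < tol) break;
    }
    return 0.5 * (wLo + wHi);
}
```

The key point is that this is an inner loop per QO line, nested inside the outer streamline-repositioning loop — collapsing it to a single step is what broke convergence in the prototype.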
Blade Data Access
Instead of the polynomial-regression workaround from Python, the C++ solver extracts blade-angle and thickness surfaces directly from the host platform's blade-row objects. For any query, a line parallel to the axis intersects the surface to yield the required angle value; surface iso-curve tangents give the partial derivatives via linear algebra.
Direct surface intersection was accurate but slow. A 1D cubic-spline interpolation strategy exploits the fact that QO lines are fixed: before the main loop, blade data is sampled along each QO line and approximated with cubic splines. Inside the solve loop, fast 1D lookups replace expensive 3D surface intersections — dramatically reducing per-iteration cost.
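The precompute-then-lookup pattern can be sketched with a minimal natural cubic spline (the production interpolator's details may differ). Before the main loop, the expensive surface query is sampled along each fixed QO line; inside the loop, only the cheap 1D evaluation runs:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal natural cubic spline through (x[i], y[i]), x strictly increasing.
// Stands in for the custom interpolator used to cache blade data per QO line.
struct CubicSpline {
    std::vector<double> x, y, m;  // knots, values, second derivatives

    CubicSpline(std::vector<double> xs, std::vector<double> ys)
        : x(std::move(xs)), y(std::move(ys)) {
        const std::size_t n = x.size();
        m.assign(n, 0.0);
        std::vector<double> a(n, 0.0), b(n, 1.0), c(n, 0.0), d(n, 0.0);
        for (std::size_t i = 1; i + 1 < n; ++i) {
            const double h0 = x[i] - x[i - 1], h1 = x[i + 1] - x[i];
            a[i] = h0; b[i] = 2.0 * (h0 + h1); c[i] = h1;
            d[i] = 6.0 * ((y[i + 1] - y[i]) / h1 - (y[i] - y[i - 1]) / h0);
        }
        // Thomas algorithm; natural BCs (m[0] = m[n-1] = 0) already encoded.
        for (std::size_t i = 1; i < n; ++i) {
            const double w = a[i] / b[i - 1];
            b[i] -= w * c[i - 1];
            d[i] -= w * d[i - 1];
        }
        m[n - 1] = d[n - 1] / b[n - 1];
        for (std::size_t i = n - 1; i-- > 0;)
            m[i] = (d[i] - c[i] * m[i + 1]) / b[i];
    }

    double eval(double t) const {
        std::size_t i = 1;
        while (i + 1 < x.size() && t > x[i]) ++i;  // locate interval
        const double h = x[i] - x[i - 1];
        const double A = (x[i] - t) / h, B = (t - x[i - 1]) / h;
        return A * y[i - 1] + B * y[i]
             + ((A * A * A - A) * m[i - 1] + (B * B * B - B) * m[i]) * h * h / 6.0;
    }
};
```

In the solver, one such spline per QO line per blade quantity is built once, turning thousands of per-iteration 3D surface intersections into 1D lookups.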
Performance Engineering
Structure-of-Arrays (SOA) Data Model
The solver maintains ~50 flow variables (pressure, temperature, density, velocity components, blade angles, curvature, etc.) at every grid point — each streamline × QO-line intersection. The original layout was a 2D array of structs (AOS), so every field access went through a chain of indirections — outer vector → inner vector → struct → field — scattering related values across memory.
Switching to a structure-of-arrays layout — where each variable is stored in a contiguous 1D std::vector indexed by row * cols + col — dramatically improved performance. Access becomes struct → vector → data, and sequential sweeps over a single variable (the dominant pattern in the solver loop) hit L1/L2 cache almost every time.
The SOA container uses C++ template metaprogramming with tag-dispatched type descriptors: each flow variable is a unique type, and compile-time index resolution via std::tuple and fold expressions eliminates all runtime lookup overhead. Direct data pointers are also exposed for potential SIMD use.
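A stripped-down sketch of the tag-dispatched idea, with three illustrative tags instead of ~50 (C++17; the production container differs in detail):

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <vector>

// Illustrative flow-variable tags; each distinct type names one field.
struct Pressure {};
struct Temperature {};
struct MeridionalVelocity {};

template <typename... Tags>
class FlowGrid {
    template <typename> using Field = std::vector<double>;  // one array per tag
public:
    FlowGrid(std::size_t rows, std::size_t cols) : rows_(rows), cols_(cols) {
        std::apply([&](auto&... f) { (f.assign(rows * cols, 0.0), ...); },
                   fields_);
    }

    // Field index resolved at compile time: no runtime lookup at all.
    template <typename Tag>
    double& at(std::size_t row, std::size_t col) {
        return std::get<indexOf<Tag>()>(fields_)[row * cols_ + col];
    }

    // Raw contiguous pointer per variable, e.g. for SIMD sweeps.
    template <typename Tag>
    double* data() { return std::get<indexOf<Tag>()>(fields_).data(); }

private:
    template <typename Tag>
    static constexpr std::size_t indexOf() {
        constexpr bool matches[] = {std::is_same_v<Tag, Tags>...};
        for (std::size_t i = 0; i < sizeof...(Tags); ++i)
            if (matches[i]) return i;
        return 0;  // unreachable for tags in the pack
    }

    std::size_t rows_, cols_;
    std::tuple<Field<Tags>...> fields_;  // one contiguous 1D array per variable
};
```

Because each variable lives in its own contiguous vector, a sweep over (say) all pressures touches memory strictly sequentially — the access pattern the solver loop spends most of its time in.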
This single change — AOS → SOA — brought the solver to sub-second total solve time for dozens of iterations on typical machine geometries (10–20 streamlines × 10–30 QO lines).
OpenMP: A Lesson in Overhead
With the sequential solver already running under a second, OpenMP parallel directives were tested to see if further speedup was possible. The result was slower execution — the thread-creation and synchronization overhead from #pragma omp parallel exceeded the computation time of each loop body.
The grid sizes in typical preliminary design (hundreds to low thousands of points) are simply too small to amortize the fixed cost of thread management. The lesson: for small, cache-friendly workloads, a tight sequential loop with good data locality beats multi-threading.
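The kind of loop that was tested looks like the sketch below (function names illustrative). The body is a handful of arithmetic operations per point, so on a few hundred points the fork/join cost of the pragma dominates the work it distributes:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Illustrative per-sweep kernel: density from pressure and temperature via
// the ideal-gas law. With OpenMP disabled the pragma is ignored and the
// loop runs serially — which, on preliminary-design grid sizes, was faster.
std::vector<double> sweepDensity(const std::vector<double>& pressure,
                                 const std::vector<double>& temperature,
                                 double gasConstant) {
    std::vector<double> density(pressure.size());
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0;
         i < static_cast<std::ptrdiff_t>(pressure.size()); ++i)
        density[i] = pressure[i] / (gasConstant * temperature[i]);
    return density;
}
```

A rough rule of thumb: if the loop body costs less than the ~microseconds of fork/join per parallel region, the pragma is a net loss.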
OpenMP remains available as a build option for unusually large meshes, but the default path stays single-threaded.
Geometry Layer
Meridional Plane Curves
The solver operates on a 2D grid defined by the intersection of streamlines and QO lines in the meridional plane. Hub and shroud boundaries are represented as Bézier curves built from control points extracted from the host platform's flow-path geometry.
Streamlines are interpolated as B-spline curves through their grid points using OpenCASCADE's interpolation API. Geometric queries — arc-length parameterization, curvature, slope, and QO-line intersection — are performed via the OCC adaptor layer, wrapped in a custom curve class hierarchy.
QO lines are straight segments from hub to shroud, and the solver repositions streamlines by adjusting each streamline's parameter along every QO line until equal mass-flow is carried between adjacent streamlines.
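The balancing step can be sketched as follows, assuming the cumulative mass flow has already been integrated along one QO line from hub to shroud. Linear interpolation of the cumulative curve stands in for the production spline-based repositioning:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of streamline repositioning on one QO line. s: positions along the
// line (hub to shroud); mCum: cumulative mass flow at those positions,
// monotonically increasing with mCum.front() == 0. Returns positions for
// nStreamlines so each adjacent pair carries equal mass flow.
std::vector<double> repositionOnQO(const std::vector<double>& s,
                                   const std::vector<double>& mCum,
                                   std::size_t nStreamlines) {
    std::vector<double> out(nStreamlines);
    const double total = mCum.back();
    std::size_t j = 1;
    for (std::size_t k = 0; k < nStreamlines; ++k) {
        const double target = total * k / (nStreamlines - 1);
        while (j + 1 < s.size() && mCum[j] < target) ++j;  // bracket target
        const double t = (target - mCum[j - 1]) / (mCum[j] - mCum[j - 1]);
        out[k] = s[j - 1] + t * (s[j] - s[j - 1]);  // linear interpolation
    }
    return out;
}
```

Repeating this on every QO line, then re-fitting the streamline curves through the new positions, is one outer iteration; convergence is declared when the positions stop moving.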
Interactive Output
Qt/VTK Task Window
The solver is exposed through a dedicated Qt task window with three visualization panels:
- Streamline geometry plot: displays the converged streamlines and QO lines forming the computational grid.
- QO-line data plot: shows any selected thermodynamic variable along a chosen QO line (hub → shroud).
- 2D contour plot: a VTK-rendered colour-mapped visualization of any flow variable across the entire meridional plane, using Delaunay triangulation of the grid points.
Additional controls let the user select which stage and segment to analyze, configure solver parameters (iteration limits, relaxation, tolerances), choose between interpolation or direct blade-data evaluation, and select the inlet-condition side.
Validation & Results
The solver was cross-compared against the existing hub-to-shroud (H2S) solver and full CFD results across multiple test cases. Once its accuracy was deemed acceptable given the massive time advantage (sub-second vs. minutes), it was merged as a beta feature.
Comprehensive documentation was prepared alongside the submission to the central repository, covering the algorithm theory, API usage, solver settings, and limitations.
Software Architecture
Module Structure
The production codebase is organized into focused modules that mirror the solver's logical stages. The details of the implementation are proprietary, but the high-level architecture can be summarized:
- Solver engine — orchestrates the full pipeline: settings, initialization, main solve loop, blade-loading, and post-processing. Exposes a clean public API for the task-window layer.
- Data model — the template metaprogramming SOA container holding ~50 flow variables with compile-time typed access. Also defines all data extraction enums and structures.
- Curve layer — OCC-based hierarchy for Bézier/B-spline flow-path curves, streamlines, QO lines, and a custom cubic-spline interpolator.
- Numerical utilities — differentiation, integration, interpolation helpers, and a Fortran-derived iterative velocity predictor.
- Contour widget — VTK-based 2D contour visualization with configurable colour maps, contour levels, and interactive settings.
- Task window — Qt integration layer connecting the solver to the platform's workflow, handling stage/segment selection, inlet-condition import, and all three plot panels.